REPORT RESUMES



ED 018 162 48 FL 000 sss 

AN EXPERIMENTAL STUDY OF THE RELATIVE EFFECTIVENESS OF FOUR 
SYSTEMS OF LANGUAGE LABORATORY EQUIPMENT IN TEACHING FRENCH 
PRONUNCIATION. 

BY- YOUNG, CLARENCE W. CHOQUETTE, CHARLES A.

COLGATE UNIV., HAMILTON, N.Y. 

PUB DATE 9 APR 63 

CONTRACT OEC-SAE-8791 

EDRS PRICE MF-$0.50 HC-$4.64 114P.

DESCRIPTORS- *AUDIO PASSIVE LABORATORIES, *AUDIO ACTIVE
LABORATORIES, *FRENCH, *AUDIO ACTIVE COMPARE LABORATORIES,
*PRONUNCIATION INSTRUCTION, TAPE RECORDERS, COMPARATIVE
ANALYSIS, EQUIPMENT UTILIZATION, ENUNCIATION IMPROVEMENT,
EQUIPMENT EVALUATION, LANGUAGE LABORATORIES, LANGUAGE
LABORATORY EQUIPMENT, LANGUAGE LABORATORY USE, SECOND
LANGUAGE LEARNING, AUDIO EQUIPMENT, LISTENING SKILLS,
ANALYSIS OF VARIANCE, SPEECH SKILLS, COLGATE UNIVERSITY,
HAMILTON, NEW YORK,

A SERIES OF SEVEN EXPERIMENTS TESTED THE RELATIVE 
EFFECTIVENESS OF USING FOUR TYPES OF LANGUAGE LABORATORY 
EQUIPMENT FEATURING INACTIVATED OR ACTIVATED FEEDBACK (IF OR 
AF) OR LONG OR SHORT DELAY PLAYBACK (LD OR SD) IN LEARNING TO 
PRONOUNCE FRENCH. AFTER PRELIMINARY EXPERIMENTATION, THREE 
REPLICATION EXPERIMENTS WERE CONDUCTED WITH PAID JUNIOR HIGH, 
SENIOR HIGH, AND COLLEGE PARTICIPANTS WHO HAD NEVER STUDIED 
FRENCH. PROCEDURES CHARACTERISTIC OF EACH EXPERIMENT WERE 
6-DAY TRAINING SESSIONS CONSISTING OF CLASSROOM INSTRUCTION 
FOLLOWED BY LABORATORY PRACTICE WITH A SPECIFIC TREATMENT 
CONDITION AND PRE- AND POST-TESTING OF DAILY AND CUMULATIVE 
PRONUNCIATION MASTERY. A COMPARISON OF GROUP ACHIEVEMENTS 
BASED ON 24 ANALYSES OF THE EXPERIMENTS, PHONEMIC AND OVERALL 
PRONUNCIATION VARIABLES, AND PRE-, POST-, AND FINAL TRAINED 
AND UNTRAINED TESTS REVEALED THAT, DESPITE THE RELATIVELY 
CONSISTENT HIGH AND LOW AVERAGES FOR THE AF AND SD EQUIPMENT, 
NONE OF THE FOUR TREATMENTS PROVED TO BE SIGNIFICANTLY 
SUPERIOR TO THE OTHERS. HOWEVER, THE ANALYSES DID INDICATE 
FUTURE RESEARCH NEEDS, SIGNIFICANT DEFICIENCIES IN THE 
COLLEGE GROUP PERFORMANCE, THE USEFULNESS OF PRE-CLASSROOM 
INSTRUCTION, AND THE INEFFECTIVENESS OF EQUIPMENT WITH MINOR 
DEFICIENCIES IN SOUND QUALITY. (AB) 









U.S. DEPARTMENT OF HEALTH, EDUCATION & WELFARE
OFFICE OF EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM THE
PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR OPINIONS
STATED DO NOT NECESSARILY REPRESENT OFFICIAL OFFICE OF EDUCATION
POSITION OR POLICY.







The research reported herein was performed pursuant
to a contract with the U. S. Office of Education,
Department of Health, Education and Welfare, National
Defense Education Act, Title VI, Contract Number
SAE-8791.







FOREWORD 



The authors gratefully acknowledge the assistance
afforded them by the following consultants
and advisers.

In the fields of linguistics and language teaching:
Simon Belasco, Marie-Jose Berlin, Gabriel Cordova,
James Perrigno, Alfred S. Hayes, James E. Iannuci,
Ivo Malan, Stanley Sapon, and Elwyn Sterling.

In the fields of statistics and experimental
design: Jack Finger, Donald L. Meyer, George B.
Schlesser, and Harry Snyder.

They are particularly grateful for the willing
cooperation of the high schools of Hamilton, Madison,
and Morrisville, New York, and of the following
officers of those schools who gave generously of
their time in the selection and scheduling of
subjects for the experiments: Andrew L. Lane, Rodney
Pierce, Conrad Ruppert, Eugene Smith, and Neil O.
Wooley.

They wish also to express thanks to the many 
individuals who worked conscientiously and effici- 
ently in carrying out the innumerable details of 
experimental procedure and statistical analysis 
as well as to their subjects whose friendly coopera- 
tion contributed so much to the success of the ex- 
periment. 

Finally, they wish to compliment Virginia B. 
Young for her fine work in the arduous task of 
typing this report. 






TABLE OF CONTENTS

TEXT

I. ANALYSIS OF THE PROBLEM
II. SEQUENCE OF EXPERIMENTS
III. EQUIPMENT
IV. SUBJECTS
V. MATERIALS AND PROCEDURES
    Development and Selection of Testing and Training Procedures
    Classroom Training
    Schedule for Daily Training Sessions
    Testing and Training Materials
    Scoring
    The Questionnaire
VI. RESULTS AND DISCUSSIONS
    The Questionnaire
        Results
        Discussion
    The Replication Experiments (Experiments 4, 5, and 7)
        Results
        Discussion
    The Continuation Experiment (Experiment 6)
        Results
        Discussion
    The Trial Experiment
        Results
        Discussion
    General Discussion and Suggestions for Further Research
        Activated vs. Inactivated Feedback
        Delayed Playback
        Standardizing the Testing of Pronunciation
        Age Level and Pronunciation Aptitude
        Types of Research on Pronunciation
VII. SUMMARY

APPENDIXES

I. TESTS AND TRAINING PROGRAMS
II. QUESTIONNAIRE
III. ANALYSIS OF OPEN-ENDED ITEMS ON QUESTIONNAIRE
IV. SUPPLEMENTARY TABLES

AN EXPERIMENTAL STUDY OF THE RELATIVE EFFECTIVENESS
OF FOUR SYSTEMS OF LANGUAGE LABORATORY EQUIPMENT
IN TEACHING FRENCH PRONUNCIATION



by

Clarence W. Young and Charles A. Choquette



I. ANALYSIS OF THE PROBLEM

The past fifteen years have seen a remarkable growth of
language laboratories for the teaching and learning of a second 
language. The essential teaching device of the language labora- 
tory is to permit students to hear and often to mimic or reply 
to recorded utterances in the second language. One of the ma- 
jor advantages of this teaching method is thought to be the 
better learning of pronunciation, since recordings can be made 
by native speakers of the second language, and the student may 
listen to and attempt to imitate their pronunciation in con- 
tinuous practice sessions. Another assumed advantage is that, 
by presenting the recordings through headphones, the student 
may be expected to hear the sounds more clearly than in the 
average classroom. 

Three systems of record playing equipment are in general
use: (1) the so-called audio-passive system, in which the
student imitates recorded material while listening through
headphones alone; (2) the audio-active system, in which the
student hears his own voice electrically amplified as he speaks;
(3) equipment which enables the student to record his imitation
of a master sample, then play back both master sample and his
imitation thereof for self-evaluation purposes.

In commercially available versions of the third system, 
there is frequently a relatively long delay between actual re- 
cording and subsequent playback. Some teachers have felt that 
a drastic reduction in this delay, permitting the student to 
evaluate his response almost immediately, would be still more 
effective. Such a short delay playback arrangement may be 
designated as a fourth possible system. 

Audio-active systems are somewhat more expensive than
audio-passive systems, and playback systems are considerably
more expensive, since they involve the purchase of recording
equipment for each student. Such recording equipment may also
be used to permit each student to select his own practice
program, but where, as is usually the case, a single program is
presented to an entire class, the only possible advantage of
the more expensive systems over the simple audio-passive system
is better self-monitoring with regard to pronunciation.
Presumably the playback systems afford the student a better
opportunity to hear and correct his errors in pronunciation. There
are obviously important economic reasons for determining whether
or not the more expensive systems are actually more effective.
But, of course, there are equally obvious pedagogical
reasons for determining which of the four possible systems is
the most effective teaching device.

The aim of the present study has been to determine which, 
if any, of the above four systems is most effective in teaching 
French pronunciation to beginning students during the first 
stages of learning. 

The terminology ordinarily employed to distinguish between
these systems refers to differences in the mechanical
arrangements. The nature of the problem will be clarified if the
terminology is more closely related to the effect on the student.
Thus viewed, the problem is one of discovering the optimum
system for correcting wrong pronunciation and for reinforcing
approximately right pronunciation.

Both the audio-passive and audio-active arrangements
provide an immediate feedback system, which tells the student,
while he is speaking, what auditory pattern has resulted from
the action of his speech muscles. We may distinguish between
these two systems by calling one the inactivated feedback
system, or IF system, and the other the activated feedback system,
or AF system.

In both the IF and the AF systems, the acoustic input to
the cochlea is conducted both through the bones of the skull
and through the system for receiving external sounds; that is,
the meatus, tympanic membranes, ossicles, and oval window.
The first type of input may be termed internal, the second
external. The chief difference between inactivated feedback and
activated feedback is one of the ratio between internal and
external input. With IF, the relative amount of external input
is reduced, although it is never eliminated, since it is always
possible to hear external sounds while wearing the headphones.
The amount of reduction depends on the type of headphone used.
Heavily padded headphones will effect a greater reduction than
unpadded phones. Hence, hearing one's voice in the IF condition
is more like ordinary hearing if headphones are not padded.
With the AF condition, the ratio depends on the amount
of gain applied to the headphones. As the gain is increased,
the proportion of external input is increased. Relative to
the intensity of vocalization, the total intensity of input is
also increased.






With respect to their effects on the student, therefore,
the IF condition is best described as "reduced external
feedback" and the AF condition as "variable external feedback."

Subjectively, at least according to our introspections,
the localization of the sound varies with the ratio of external
to internal input. In ordinary speech, unless there is a
strong echo in the room, the voice appears to be vaguely
localized immediately in front of the face. Upon covering the
ears with the hands, the localization moves inward; that is,
the sound seems to be localized within the head. In the IF
condition with unpadded headphones, the sound is only slightly
inward. Actually, the sound is more like the voice in ordinary
speech than the voice with ears covered by hands. With the AF
condition at low gain, the localization moves back to the
position characteristic of ordinary speech, but as the gain
increases, the sound is heard more and more in the headphones
themselves. As the intensity reaches the level of unpleasantness,
the sound seems to penetrate the ears and enter the head.
As the sound "moves" from its position in the IF condition to
the ordinary position, then to the headphones, and then into
the ears and head, its ability to compete with other sounds for
attention seems to increase. This may be a function of intensity,
although low speech sounds may be heard in the headphones
with what seems to us to be an increased salience. Similarly,
as the sound moves from the glottal region to the forehead and
to the headphones, it seems to grow clearer, and sound
differences, or variations in intonation, seem to be more readily
discriminated. As the gain is further increased and intensity
reaches higher levels, however, clarity decreases rapidly. As
might be expected, when the sound is localized in the region of
the face, it sounds more like one's own voice than in any other
position.

With relatively low gain, the activated feedback condition
produces an impression similar to ordinary speech conditions.
When, with higher gain, the voice is heard in the headphones,
it becomes more like the model that the student is imitating,
and it may actually provide conditions for discrimination
superior to those in ordinary speech. Actually, nothing is really
known about this, and we do not know what level of gain, if
any, is optimal for activated headphones. At present, the
attempt is made to set the gain so as to produce maximum comfort
for the student. We know little of just how different the
student's experience usually is under the IF and AF conditions.
Indeed, it is not beyond the range of possibility that, under
ordinary laboratory conditions, the distraction and masking
provided by the sound of neighboring voices may eliminate
whatever differences might otherwise exist between the
effectiveness of the activated and inactivated conditions. Because the
microphones pick up other sounds than those of the speaker's
voice, this distraction tends to be greater for the activated
condition unless unidirectional, close-speaking microphones
with heavily padded headphones are used.

Obviously, there are many specific ways in which both the 
inactivated and activated headphones may be used, and the dif- 
ferences between them exist on a quantitative continuum. Psy- 
chologically, as the student experiences them, inactivated 
feedback without padded headphones may be more like activated 
feedback with unpadded headphones and low gain than the latter 
is like activated feedback with heavily padded headphones and 
the gain turned to the point where the locus of sound is entire- 
ly in the headphones and ears. And the latter condition may be 
less like the condition of ordinary speaking than either of the 
two former conditions. 

There are many possible arrangements for both IF and AF,
not all of which have been analyzed above. It might well be
that the most effective inactivated system would be superior to
certain less effective activated systems, and vice versa. Or it
might be that any reasonably good immediate feedback system is
as effective as any other. There seems to be a tendency to
assume that the AF condition must necessarily be superior to the
IF condition. But the following factors may be suggested which
might make for superiority in an IF system:

(1) It might be that lack of self-consciousness about
pronunciation is a positive factor in learning to pronounce through
mimicry. The IF system, since it does not require speaking into
a microphone and does not emphasize the sound of the speaker's
voice, might tend to reduce self-consciousness.

(2) The need to speak into a microphone complicates the 
situation. This makes for some distraction. There is also the 
need for adjusting gain to produce an optimal comfort. This 
can create problems and distractions.

(3) The possibility cannot be excluded that bone-conduction
hearing gives a better cue to the accuracy of pronunciation,
for some phonemes at least, than internal plus external
input. The psychologist author of this report, for example,
believes that he can perceive diphthongization of vowels more
readily with ears covered than without.

The possible advantages of AF are, of course, more obvious.
One clear advantage is that, with an activated hookup, the gain
can be adjusted so as to produce many degrees of external input,
from very low to the highest tolerable. In principle, it would
be possible to find the optimal amount of external input and
use that. Since an AF arrangement can reproduce the conditions
of IF, the latter can be superior to the former only because of
its greater simplicity and lower expense.

The above analysis is not intended to settle our problem in
advance, but rather to demonstrate the impossibility of making
a priori judgments by pointing to a few of the possible factors
that might make for differences or for absence of differences in
effectiveness between the IF and AF conditions.

Correlative with our designation of the two immediate
feedback conditions as "inactivated feedback," or IF, and "activated
feedback," or AF, we shall designate the two playback conditions
as "long delay," or LD, and "short delay," or SD. Each of these
conditions involves recording the student's voice as he imitates
the model. Hence some of the student's learning is achieved in
the condition of immediate feedback, whether IF or AF. The
effectiveness of any particular course of training with playback
must, therefore, depend in part upon the particular system of
immediate feedback with which it is combined. There are not
only many possible immediate feedback arrangements, but there
are many ways in which playback may be combined with immediate
feedback. Obviously, a whole series of delay periods, from a
fraction of a second up to several days, can be introduced. It
is possible, for example, that it would be more effective to
delay playback until just before a practice session to give the
most useful information on the aspects of the student's
pronunciation that need to be improved. A fairly obvious weakness of
any system of training using playback is the fact that time spent
listening to playback is subtracted from time that might be
spent in active practice. However, the proportion of time spent
in practice to that spent in listening may be widely varied.
For example, if forty minutes were to be spent in practice three
times a week, it would be possible to record only the last four
minutes and then to play it back at the beginning of the next
practice session. This might enable the student to estimate
his most characteristic successes and failures and provide him
with more purposeful goals for the ensuing practice session.
Or short delay practice might be made to occupy the first few
minutes of each session. Countless other temporal arrangements
are obviously possible.
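
The forty-minute example above can be worked through numerically. The sketch below is illustrative only and is not part of the original report. The function name `listening_share` is hypothetical, and it assumes that the four recorded minutes are replayed once, in real time, inside the next forty-minute session.

```python
def listening_share(session_minutes: float, playback_minutes: float) -> float:
    """Fraction of a session spent listening to playback rather than practicing.

    Assumption (not from the report): the recorded excerpt is replayed once,
    in real time, inside a following session of the same length.
    """
    if session_minutes <= 0:
        raise ValueError("session_minutes must be positive")
    if not 0 <= playback_minutes <= session_minutes:
        raise ValueError("playback_minutes must lie between 0 and session_minutes")
    return playback_minutes / session_minutes


# The report's example: forty-minute sessions, last four minutes recorded.
share = listening_share(40, 4)
practice_minutes = 40 - 4
print(f"listening share: {share:.0%}, active practice: {practice_minutes} min")
```

Under these assumptions, one tenth of each session goes to listening and thirty-six minutes remain for active practice. Other splits can be explored by changing the two arguments.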

The foregoing discussion should make it clear that no ex- 
periment could possibly be set up which could be certain of pro- 
viding a definitive test of the relative effectiveness of the 
four equipment systems unless it could be known in advance which 
particular arrangement is optimal for each system. We do not 
have this knowledge or anything approximating it. Furthermore, 
our investigation has been limited to the first few hours of 
training. A training system might be shown to be highly effec- 
tive for the first approach to language training, yet in the 
long run it might be of little value. On the other hand, a 
system might show little advantage at first, but on the basis 
of a slight superiority in some respect, its cumulative
advantage might be great.

It follows that the mere finding of statistically significant
differences between the treatments would not constitute a
definitive determination of relative superiority, since changes
in procedure or specific kinds of equipment might reverse the
findings. If we let A, B, C, and D stand for the particular
form of IF, AF, LD, and SD we might actually use, and Ax, Bx,
Cx, and Dx stand for any other possible forms of each condition,
the finding that A is significantly superior to B, C, and D
would not prove it superior to Bx, Cx, and Dx, nor would it bring
proof of the superiority of Ax to B, C, and D or to Bx, Cx, and
Dx.

The same restrictions on generalization apply with respect
to the group of subjects selected for a given experiment. A
method that might be superior for one set of subjects might not
be superior for all other sets. Furthermore, different kinds of
program, different schedules of study, as well as different
amounts of total time spent might result in differences in the
effectiveness of the four conditions.

This kind of difficulty is faced by most experiments in
the behavioral sciences, and no responsible behavioral scientist
is likely to come to a final generalized conclusion on the basis
of a single experiment. An experiment gets us better acquainted
with a certain area of phenomena, narrows the range of
uncertainty, and suggests strategic approaches to further experiments
which will further narrow the range of uncertainty. At any
stage in the process of narrowing this range, practical decisions
must be made on the basis of best estimates of the true
relationships among variables.

For example, prior to our experiment, there could be no 
basis for rejecting the hypothesis that the use of a short de- 
lay system is by far the best method of initiating the teaching 
of pronunciation. If the SD treatment had turned out to be 
markedly more effective than the others, this hypothesis would 
have received strong confirmation, and the range of uncertainty 
would have then been narrowed. Practically, such a finding 
would lead to decisions to produce short delay systems commer- 
cially and to test their usefulness more widely. There would 
still remain the following questions: 

(1) Does the short delay system maintain its superiority 
over a period of time? 

(2) Are there methods of employing the other equipment 
systems in a different way than they were employed in our ex- 
periments which make them as effective as or more effective 
than the SD system? 



(3) What temporal combination of short delay with
immediate feedback systems is most effective?

(4) What particular short delay arrangements are most
effective?

As a matter of fact, our experiment has not provided the
finding hypothesized above. Hence, it has narrowed the range
of uncertainty in a different way and raised a different set
of questions.

To summarize: The general problem approached in the present
experiment contains too many variables to be tested in a single
study, and the experiment cannot be expected to provide a final
and certain answer to the practical problem of the best kinds
of equipment systems and the best manner in which to employ
them. The aim of the experiment is exploratory: to get some
measure of the relative effectiveness of the systems under
circumstances designed to give each one an opportunity to display
its merits.

Various considerations lead to contrary theoretical
assumptions concerning the probable effectiveness of the four systems.
It might be assumed that the immediate feedback conditions should
be favored because immediate reinforcement or knowledge of
results is typically found to be superior to delayed reinforcement
or knowledge of results in learning situations. The playback
conditions, on the other hand, might be favored by the greater
clarity or certainty of the knowledge of results. Under
immediate feedback, knowledge of results is received under the strain
of actually speaking the utterance, and the attention may thus
be distracted from the task of discriminating success and
failure. The short delay system used in our study played back the
student's voice within a second and a half after he began his
utterance. Our hypothesis that this might be especially
effective was as follows: As the student speaks the utterance, he
receives immediate knowledge of results. With less than a
second's delay, he receives a presumably clearer knowledge of
results. This, we hypothesized, might provide him with a doubly
strengthened basis for eliminating errors and establishing
correct habits. He might immediately vary his mode of pronunciation
and discover exactly what motor patterns produced the best
results, and his judgment of these results might be based on
better listening conditions than are provided by immediate
feedback alone.

Under the long delay condition, there is no opportunity to
judge the success of a particular motor pattern of vocalization
and correct it, since when the student hears his utterance, he
has no way of remembering how he made it. Long delay, however,
affords a particularly good opportunity for the student to
observe the difference between his pronunciation and that of the
model. Short delay may be described as an attempt to secure
the advantages of both immediate feedback and playback:
relatively clear and undistracted information combined with a
relatively short interval between the response and the knowledge of
results. A possible handicap for short delay is the fact that
the student receives both immediate feedback information and
playback information in short succession relative to a single
response. It might be that making judgments based on two types
of information could actually lead to some confusion.

In spite of the possibility of this handicap we were
inclined, prior to running our experiment, to expect great things
of the SD condition, since it appeared to be favored both by
short delay between response and knowledge of results and by
greater clarity in knowledge of results. A major consideration
in selecting the type of experiment we chose was to test the
possibilities of this new arrangement for playback. Otherwise
a comparison of the three other treatments under the condition
of course teaching might have seemed preferable. In the absence
of already established programs for short delay playback,
however, it was necessary to make a test involving a relatively
short training period in a special experimental situation.

As a by-product of our study, we sought to compare the 
rates of learning of junior high school students (seventh and 
eighth grades), senior high school students (ninth, tenth, and
eleventh grades) and college men. This was to test the common 
belief of language teachers that younger students are more apt 
in the learning of pronunciation. We also studied the problem 
of reliability and validity in the measurement of pronunciation. 







II. SEQUENCE OF EXPERIMENTS

The experiments were carried out with subjects who had no
training in or experience with the French language and who were
paid for their service. Ordinary laboratory teaching conditions
were duplicated to the extent that the training occurred in the
language laboratory with groups of subjects. To restrict the
experiments to the skill of pronunciation alone, the subjects
had no knowledge of the meaning of the utterances they
pronounced, nor were they shown the printed words until the
experiment in which they served was completed.

Seven experiments were run in the course of the study.
The first two of these were designed simply to develop testing
and scoring procedures. The third was a trial experiment
testing the effects of the IF, AF, and LD treatments only, since
the equipment for the SD treatment had not yet been constructed.
Thirty-four junior and senior high school students served as
subjects in this third experiment. It was followed by a series
of four experiments testing the effects of all four treatments
in which the testing and training material was modified as a
result of the experience gained in Experiment 3.

In Experiment 4, 28 senior high school students served as
subjects. In Experiment 5 there were 28 college students, but
one of them, in the SD group, dropped out after the third day.
Twenty-eight junior high school subjects served in Experiment
7. Experiment 6 was a continuation of Experiment 5 with the
same group of college subjects. It was designed to test the
effects of a longer period of training.

Experiments 3 through 7 constituted the entire series of
Training Experiments. They are summarized in Table I.
Experiment 3 will be called the Trial Experiment and the remaining
four, the Main Experiments. Since Experiment 6 was a
continuation of training with the group of college subjects in
Experiment 5, it will be termed the Continuation Experiment. Since
Experiments 4, 5, and 7 were exact replications of one another
except for the subjects, they will be called the Replication
Experiments. Essentially the same schedule was followed in all
five Training Experiments. Except for Experiment 3, an Aptitude
Test was given the first day. In Experiment 3, the subjects
were practiced in handling the machines and mimicking French
utterances for three days and then given an Aptitude Test the
fourth day. Following the Aptitude Test in all five experiments,
the subjects were divided into treatment groups and given six
Training Sessions on each of six days. On the day following
the six training days a Final Test, identical in content with
the Aptitude Test, was administered.






TABLE I

SUMMARY OF TRAINING EXPERIMENTS, SHOWING TYPE OF EXPERIMENT,
NUMBER OF EXPERIMENT, NUMBER OF INTRODUCTORY, TESTING, AND
TRAINING DAYS, AND TYPES OF SUBJECTS.

Key for "Number of Days" columns:
  IAT: Introduction and Aptitude Test
  TS:  Training Sessions
  FT:  Final Test

                                    Number of Days
Type of Experiment          No.     IAT   TS   FT    Type of Subject
--------------------------------------------------------------------
Trial Experiment             3       4     6    1    Jr., Sr. H.S.
Main Experiments:
  Replication Experiment     4       1     6    1    Senior H.S.
  Replication Experiment     5       1     6    1    College
  Replication Experiment     7       1     6    1    Junior H.S.
  Continuation Experiment    6       1     6    1    College

(The experiments are numbered in the order in which they were
carried out.)
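
For readers who wish to work with the schedule in Table I programmatically, the rows can be held in a small data structure. The following is a minimal sketch, not part of the original report. The class name `TrainingExperiment`, its field names, and the derived `total_days` figure are our own assumptions; only the row values come from Table I.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingExperiment:
    number: int      # experiment number, in order of execution
    kind: str        # "Trial", "Replication", or "Continuation"
    iat_days: int    # Introduction and Aptitude Test days
    ts_days: int     # Training Session days
    ft_days: int     # Final Test days
    subjects: str    # type of subject

    @property
    def total_days(self) -> int:
        # Derived figure (not stated in the report): sum of the three columns.
        return self.iat_days + self.ts_days + self.ft_days


# Rows transcribed from Table I.
TABLE_I = [
    TrainingExperiment(3, "Trial", 4, 6, 1, "Jr., Sr. H.S."),
    TrainingExperiment(4, "Replication", 1, 6, 1, "Senior H.S."),
    TrainingExperiment(5, "Replication", 1, 6, 1, "College"),
    TrainingExperiment(7, "Replication", 1, 6, 1, "Junior H.S."),
    TrainingExperiment(6, "Continuation", 1, 6, 1, "College"),
]

for exp in TABLE_I:
    print(f"Experiment {exp.number} ({exp.kind}): {exp.total_days} days")
```

On this reading, the Trial Experiment occupied eleven days and each Main Experiment eight, the difference being the three days of machine practice that preceded the Aptitude Test in Experiment 3.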






III. EQUIPMENT



Space Arrangements in Laboratory

The experiment was performed in a laboratory containing 34 
semi-isolated booths in five ranks, seven booths in each rank
except the first which contained six. The control center was 
located across the front of the laboratory, separated from it 
by a partition composed of cinder blocks to the height of four 
feet with plexiglas continuing sixteen inches above. 

The laboratory outside the control center was approximate- 
ly 24 feet long, 19 feet wide, and 7 feet 6 inches high. The 
ceiling was covered with sound-absorptive 3/4 inch acoustical 
tile. In each booth, the microphone and recorder stood on a 
formica -topped table 24 inches deep. The booths were 27 inches 
wide, separated from one another by half -inch plyw^d parti- 
tions. The partitions were 23 inches high and 29 inches long, 
so that they projected 5 inches beyond the edge of the table 
where the subject sat. The backs of the booths were of plexi- 
glas. Extending from the back, an 18 inch wide strip of plexi- 
glas covered the top of each booth. 

This arrangement provided a considerable degree of isola- 
tion, and at the same time allowed the experimenter at the con- 
trol center to watch and communicate with the subjects. Com- 
munication from the experimenter was secured with a microphone 
fed into the machine that played the program tape, and communi- 
cation from subjects was achieved through hand signals. 



List of Equipment and Specifications 

Wollensak T-1500 tape recorders (used for IF, AF, and LD)

Wollensak T-1515-4 tape recorders (used for SD)

Wollensak tape recorder microphones (subjects' microphones)

Revere T-202 tape recorders (used for program tapes
and producing scorers' tapes)

Ampex 351 tape recorders (used in recording program tapes and
scorers' tapes)

Electro-Voice 664 cardioid microphone (used in recording program
tapes)

Heath EA-1 audio amplifiers (used for mixing inputs and driving
earphones in short delay mechanism)

Shure TR5B-J magnetic recording head (used for delayed playback
pickup in short delay mechanisms)

General Electric UPX-003B pre-amplifier (used for pre-amplification
of Shure recording head output)

Viking AS-75 amplifiers (used to provide "activation" of subjects'
earphones)

Military HS-33 600-ohm magnetic earphones (used as subjects'
earphones)

Scotch 311 Tenzar recording tape (used as program tapes
and for SD recording)

Scotch 190 Acetate recording tape (used for subject recordings
for IF, AF, and LD)

Scotch 111 Acetate recording tape (used for scorers' tapes)



Fig. 1. Diagram of individual booth wiring. (See text
for description.)



Frequency response:   listening: 60-11,300 cps within 2 db.
                      recording: 60-7,500 cps within 2 db.
                      (high frequency response limited by
                      microphone)

Noise and hum:        minimum of 50 db. below saturation
                      recording level and at least 55 db.
                      below normal playback level for IF,
                      AF and LD. 22 db. below normal playback
                      level for SD.

Distortion:           combined harmonic and IM below 2.5% of
                      total sound energy within specified
                      frequency range

Wow and flutter:      0.23% (deviation less than ±.03% for any
                      machine)

Tape speed:           standardized at 7½ ips



Wiring System to Booths

The audio transmission system used 600-ohm balanced lines
throughout. Greater than 55 db. isolation between any two
pairs was provided through grounded shielding of floating
balanced pairs. Earth ground was established at one point only,
and all shield drain wires and jacks were isolated. The level in
the system was not in excess of +4 VU, where zero VU is equivalent
to one milliwatt across a 600-ohm resistive load. Major
wiring was provided by Belden 8766 cable.
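
The level reference above can be turned into voltages with a short calculation. The sketch below is illustrative only: it assumes that VU steps can be treated as decibels relative to the stated reference of one milliwatt in a 600-ohm resistive load, and the function names are hypothetical rather than taken from the report.

```python
import math

# Reference stated in the text: zero VU is one milliwatt across 600 ohms.
REFERENCE_POWER_W = 1e-3
LOAD_OHMS = 600.0


def vu_to_power_watts(level_vu):
    """Power for a given level, assuming VU steps behave like decibels."""
    return REFERENCE_POWER_W * 10 ** (level_vu / 10.0)


def vu_to_rms_volts(level_vu):
    """RMS voltage across the 600-ohm load, from P = V^2 / R."""
    return math.sqrt(vu_to_power_watts(level_vu) * LOAD_OHMS)


zero_vu_volts = vu_to_rms_volts(0.0)     # reference level
max_level_volts = vu_to_rms_volts(4.0)   # the +4 VU ceiling quoted above

print(f"0 VU  is about {zero_vu_volts:.3f} V rms")
print(f"+4 VU is about {max_level_volts:.3f} V rms")
```

Under these assumptions the +4 VU ceiling corresponds to a little over one volt rms on the line.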



Booth Wiring 

The following description applies to the wiring for the
Main Experiments. In the Trial Experiment, the signal from the
student's microphone was fed through the Wollensak amplifier
for purposes of activation, with a resulting mismatch between
the microphone and tape recorder input. This was found to
produce a sound that was inferior to both the sound from the
program tapes and from the LD playback. To equalize conditions
for activation, Viking AS-75 amplifiers were introduced.

Individual booth wiring is represented in Fig. 1. Switch
S1 provided two alternative input sources, both of which terminated
at the control center. For AF, the chosen input fed
simultaneously the Viking amplifier and the tape recorder.
The Wollensak microphone also simultaneously fed both inputs
through a resistive balancing network. In this situation, the
student heard both the program and his own recording voice.
For LD and for tests the Wollensak recorded both the program
source and the subject's response. By shunting the Viking
microphone input with switch S2, activation was eliminated. For
LD playback, the headphones were removed from the Viking (J1)
and inserted in the Wollensak external speaker jack (J2).

Thus, through combinations of the above conditions, the IF,
AF, and LD situations were all made available. This
system was the same in the SD booths, and the SD playback could
be switched in or out.



Fig. 2. Short delay playback system.



The Short Delay Playback Unit

The short delay playback system is shown in Fig. 2
and diagrammed in Fig. 3. The basic recorder was the Wollensak
T-1515-4. Attached to its right side was a perforated
masonite plate on which was fixed a series of tape-directing
rollers to carry the tape past the playback recording
head. A Shure TR5B-J recording head (RH) was mounted along
the tape route and provided the delayed playback pickup.
The playback head output was
fed through a General Electric UPX-003B pre-amplifier (GE)
(modified for standard NAB tape head equalization) to a
junction containing the two other sources (program from
control center and student's voice from Wollensak output).
This signal combination was mixed, equalized and then fed
to the crystal input of the Heath EA-1 audio amplifier
(mounted directly beneath the perforated masonite plate).
The EA-1 output, terminated with an 8-ohm resistive load,
then finally drove the subjects' HS-33 earphones at an adjustable
gain. The subject's voice was recorded through
the Wollensak recording head. The tape distance between
the recording head and the playback head was 11 inches,
thus allowing a 1½ second delay between the beginning
of a subject's utterance and the playback of the utterance.



Fig. 3. Diagram of short delay playback system. (See
text for description.)

P.A.: Pre-amplifier
R.H.: Recording head
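
The relation between head spacing and playback delay can be checked with a one-line calculation. The sketch below is illustrative: the 11-inch spacing comes from the text, while the tape speed of 7.5 inches per second is an assumed value (a common recorder speed that is consistent with a delay of about one and a half seconds), and the function name is hypothetical.

```python
# Tape distance between recording head and playback head (from the text).
HEAD_SPACING_INCHES = 11.0

# Assumed tape speed in inches per second; not confirmed by the text here.
ASSUMED_TAPE_SPEED_IPS = 7.5


def playback_delay_seconds(spacing_inches, speed_ips):
    """Time for a point on the tape to travel from one head to the other."""
    if speed_ips <= 0:
        raise ValueError("tape speed must be positive")
    return spacing_inches / speed_ips


delay = playback_delay_seconds(HEAD_SPACING_INCHES, ASSUMED_TAPE_SPEED_IPS)
print(f"Delay between utterance and echo: {delay:.2f} s")
```

With these values the computed delay comes out slightly under one and a half seconds; halving the assumed speed would double the delay.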

The quality of the sound system for the IF, AF, and
LD groups was judged to be "the best ever heard in a
teaching laboratory" by a national authority on sound
systems who possessed a wide range of experience. The
quality of the SD system was judged by the experimenters
to be not as good, partly because of the relatively lower
elevation of signal over noise. This difficulty appeared
even when the playback was switched out. The
quality was good enough, however, to permit clear discrimination
of speech sounds at all times. To accommodate
to other activities of the subjects, it was necessary to
begin Experiment 4 before the SD equipment had been fully tested.
Throughout the first four days of the experiment, breakdowns in
the SD equipment occurred which made it necessary to shift subjects
to standby equipment or to make adjustments during the
experiment.

In brief, the experimenters did not succeed in perfectly
equating equipment conditions between the SD group and the
others, and this may have produced some handicap for the SD
condition.







IV. SUBJECTS 

The college students in Experiments 5 and 6 were secured 
from the student body of Colgate University, an all men’s school. 
The junior and senior high school students came from nearby high 
schools. The junior high school students for the trial experi- 
ment came from Madison High School, the senior high school stu- 
dents from Hamilton High School. Senior high school students 
from Morrisville High School served in Experiment 4 and junior 
high school students from the same school in Experiment 7. 

It was impossible to get a representative sample of students,
since it was necessary to accept almost all those who volunteered.
The volunteers tended to come from the more able and serious
students in all schools. The tendency to get the more conscientious
students was a distinct advantage, since it was necessary to
gain cooperation for a rigidly programmed and rather artificial
procedure which involved a somewhat monotonous series of repetitions.
Most of the senior high school students were girls because
athletic activities made it impossible for the boys to
participate.



All prospective subjects were given Part II of the Carroll-Sapon
Modern Language Aptitude Test (MLAT), which was chosen as a
preliminary test of pronunciation aptitude on the basis of the
statement in the Manual¹ that "it tends to correlate highly with
the ability to mimic speech sounds and sound combinations in
foreign languages."

In Experiment 3, subjects who scored below 17 on the MLAT
tended to speak with such low voices that their utterances could
not be scored, and they often failed to respond. For the Main
Experiments, therefore, prospective subjects were given a voice
test, and those failing to speak loudly enough as well as those
scoring below 17 on the MLAT were rejected. This resulted in the
rejection of five prospective subjects.

It was necessary to drop some subjects from the statistical 
analysis of Experiment 3 as explained in the section on Develop- 
ment and Selection of Testing and Training Procedures. The re- 
sult was that only twenty-four of the original thirty-four sub- 
jects in Experiment 3 entered into the experimental analysis. 



¹Carroll, John B. and Sapon, Stanley M. Manual for Modern
Language Aptitude Test. New York: The Psychological Corporation,
1959.






TABLE II

SUBJECTS IN EXPERIMENT 3 BY SCHOOL PLACEMENT, TREATMENT GROUP,
SEX, AND SCORE ON THE SECOND PART OF THE MLAT.

                     IF           AF           LD         Dropped*
                 Sex  MLAT    Sex  MLAT    Sex  MLAT    Sex  MLAT

                  M    17      F    18      F    22      F    19
Junior High       M    20      F    19      F    21      F    16
School            F    21      M    23      M    20      F    15
                  F    24      F    22      M    18      F    13
                  F    21      M    18                   F    23

                  F    29      F    29      F    28      F    18
Senior High       F    24      F    17      F    23      F    24
School            F    17      F    20      F    18      M    21
                                            F    24      F    17
                                                         F    21

MLAT Mean:            21.6         20.8         21.8         18.7
MLAT SD:              3.[?]        3.37         2.74         3.37

*Dropped: Not used in statistical analysis.
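
The group means reported in Table II can be reproduced from the individual scores. The sketch below is illustrative: the assignment of each score to a treatment column follows one reading of the table layout and should be treated as an assumption, and the variable names are hypothetical.

```python
from statistics import mean

# MLAT Part II scores per column of Table II (column assignment assumed).
mlat_scores = {
    "IF": [17, 20, 21, 24, 21, 29, 24, 17],
    "AF": [18, 19, 23, 22, 18, 29, 17, 20],
    "LD": [22, 21, 20, 18, 28, 23, 18, 24],
    "Dropped": [19, 16, 15, 13, 23, 18, 24, 21, 17, 21],
}

# Means as printed in the table, for comparison.
reported_means = {"IF": 21.6, "AF": 20.8, "LD": 21.8, "Dropped": 18.7}

group_means = {group: mean(scores) for group, scores in mlat_scores.items()}

for group, value in group_means.items():
    print(f"{group:8s} n={len(mlat_scores[group]):2d} "
          f"computed mean={value:.2f} reported={reported_means[group]}")
```

Under this reading the three treatment groups have eight subjects each and the dropped column has ten, which agrees with the statement that twenty-four of the original thirty-four subjects entered the analysis.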



Table II shows the sex, MLAT scores, school placement and
experimental treatment of all subjects in the Trial Experiment.
Table III gives the same information for the four Main Experiments.
It will be noted that, except for the college group
which was all male, the treatment groups were approximately
equalized with respect to the number of subjects of each sex,
although only in the junior high school group of Experiment 7
was it possible to get an equal number of males and females
in the entire experimental group.






TABLE III

SUBJECTS IN MAIN EXPERIMENTS, BY EXPERIMENT, TREATMENT GROUP,
SEX, AND SCORE ON THE SECOND PART OF THE MLAT.

                     IF           AF           LD           SD
                 Sex  MLAT    Sex  MLAT    Sex  MLAT    Sex  MLAT

                  F    18      F    17      F    29      F    26
                  F    25      F    25      F    25      F    25
Experiment 4:     F    26      F    22      F    26      F    22
                  F    19      F    24      F    18      F    26
                  F    24      F    23      F    24      F    20
                  M    22      F    27      M    21      F    20
                  M    27      M    25      M    18      M    19
MLAT Mean:            23.3         23.0         23.0         22.6
MLAT SD:              2.96         3.2          3.96         2.82

                  M    30      M    30      M    29      M    30
                  M    27      M    27      M    28     (M)* (26)
Experiments       M    17      M    24      M    25      M    25
5 and 6:          M    24      M    24      M    24      M    24
                  M    22      M    29      M    22      M    23
                  M    21      M    21      M    22      M    21
                  M    20      M    19      M    19      M    20
MLAT Mean:            23.0         24.9         24.1         23.8
MLAT SD:              4.09         3.76         3.27         3.24

                  F    24      F    27      F    29      F    23
                  F    23      F    22      F    25      F    23
Experiment 7:     F    19      F    19      F    19      F    21
                  M    20      M    20      F    17      F    18
                  M    25      M    25      M    24      M    27
                  M    24      M    24      M    17      M    21
                  M    17      M    17      M    19      M    19
MLAT Mean:            21.7         22.0         21.4         21.7
MLAT SD:              2.82         3.30         4.28         2.76

*This subject dropped out after the third day of Experiment 5.
The Mean and SD do not include this subject.






V. MATERIALS AND PROCEDURES

Development and Selection of Testing and Training Procedures 

Considerable preliminary investigation preceded the final 
arrangement of testing and training procedures. The aim was to 
discover the kind of testing and scoring methods that would pro- 
vide reliable and valid measures of pronunciation, to find the 
most effective ways of quickly adapting naive subjects to the 
laboratory situation, and to determine the kind of programs that 
would be best adapted to the subjects' capacities for improving
phonological accuracy over a short period of time. 

In several respects the capacities for adjustment of junior
high school students were found to be inferior to those of seni- 
or high school and college students. Tests of procedure were, 
therefore, confined largely to individuals in the younger group, 
with three to six individuals not later participating in the ex- 
periments being employed in each test. Some of the final ad- 
justments of programs and procedures were made on the basis of 
experience gained in Experiment 3, the Trial Experiment. 

The methods finally developed were adapted to the limita- 
tions of the least able junior high school students, and the 
following description of limitations applies to them, although 
in some cases the same limitations might also be found in the 
older groups. 

After attempts to train both younger and older groups to 
adjust the gain on their machines, start them, stop them, and 
rewind the tapes, it was decided to confine the subject's control
of the machines to the function of stopping them only.

This was necessary because a complete record of each student's 
test recordings was required for each laboratory session, and
it was essential that all students work under comparable condi- 
tions for hearing both their own voice and that of the model . 

A single error on the part of a student in the operation of his 
equipment could render his performance non-comparable with the 
others or destroy the data he provided for an entire session. 

Each experimental group, therefore, was furnished with a 
proctor who was highly experienced in the handling of labora- 
tory equipment. The proctors stood at all times at the ends of 
the rows. Prior to every test and practice session, material 
was played on the master tape so that subjects could indicate 
by raising their hands whether or not they were receiving prop- 
erly. Whenever failures occurred, the proctors immediately cor- 
rected them or, if necessary, called on the technician, who also 
served as one of the proctors. During the warm-ups preceding 
each test, the proctors passed behind each student to make sure
that the flicker light on his machine indicated the proper gain
for recording.

The experience gained in the Trial Experiment was of con- 
siderable value in indicating the degree of care on the part of 
the proctors required to keep all equipment operating properly 
at all times. Occasionally a machine failed to record during a 
test or recorded so poorly that the material could not be scored. 
This, along with recording failures caused by inadequacies in
the subjects, made it necessary to drop several subjects from the
statistical analysis.

Failure to record was due to failure of the switch to make 
the proper contact when the "Record" key was pressed. This dif- 
ficulty was overcome by the following procedures: 

1. Before the machines were started, the power was turned
off at the master switch. Each proctor then passed along his
row and snapped down the keys with a firm thrust. The power was
then turned on, and ten seconds later the master tape was started.



2. After each test, the proctors wound the tapes back a
short way and then played the last utterance or two to make certain
of the recording. In the rare cases when recording failed,
the student was retested. Although this undoubtedly introduced
a practice error, the amount of additional practice was slight
compared with that provided by the Practice Sessions and
warm-ups. Furthermore, the few cases which did occur were about
evenly balanced for the four experimental conditions.

3. Whenever a machine failed to record, it was removed
for servicing and another one put in its place.

In Experiment 3 some unscorable recordings of utterances 
were produced because a subject spoke in a weak voice or moved 
his mouth too close to or too far from the microphone, or be- 
cause the gain setting was poor for a particular test. The 
method of overcoming the first difficulty through careful selec- 
tion of subjects has already been described. The second was 
overcome by requiring all subjects to place the microphone at 
right angles to the mouth and speak into it as it barely touched 
the cheek. Constant vigilance on the part of the proctors main- 
tained adequate gain settings throughout the Main Experiments. 

As a result of the above precautions, the number of unscorable 
utterances for the Main Experiments was reduced to a fraction 
of one percent. 

The preliminary experiments appeared to give rather clear-cut
indications that the optimum number of repetitions for a
single utterance was two. Generally the subjects improved their
mimicry on the second presentation of an utterance, but frequently
the third repetition was not as good as the second.
Furthermore, the subjects themselves said they wanted a second
chance at an utterance, but did not like to repeat it three
times. This objection applied only to utterances that were not
changed in any way. After two repetitions of a single syllable,
two repetitions of the same syllable plus another one offered
no difficulty. It was, therefore, possible to "build up" to a
six-syllable utterance according to the following pattern without
introducing a falling off in improvement between the first
and second repetitions of individual utterances:

          1            1 2 3 4
          1            1 2 3 4
          1 2          1 2 3 4 5
          1 2          1 2 3 4 5
          1 2 3        1 2 3 4 5 6
          1 2 3        1 2 3 4 5 6

(Numbers indicate successive syllables in a six-syllable
sentence.)

Variations on this pattern were therefore used in constructing
testing and training materials.

The programs were deliberately designed to repeat the same
syllables over and over again. The aim was to give maximum
opportunity to improve phonological accuracy through self-correction.
Only two six-syllable sentences were introduced in each
daily Practice Session, with the result that each of the twelve
syllables occurred from 16 to 24 times in a Practice Session,
depending on its position in the sentence.

Partly to compensate for the monotonous effect of the repetitions
(as well as to provide a measure of daily progress) the
subjects were given a Pre-test and Post-test before and after
each Practice Session and were encouraged to strive to improve
their pronunciation so as to do well on the Post-test. They
were also encouraged to do their best on both the Pre-test and
Post-test. In general, the subjects were highly cooperative,
and most of them appeared to be putting forth their best efforts
at all times.

Two methods of scoring to measure improvement in phonological
accuracy were employed, the Overall and the Phonemic. The
former scores were secured by rating the final presentation of
the first three syllables of a sentence as well as the final
presentation of the entire six-syllable sentence for overall
approximation to the French phonological system, including intonation
as well as correctness in the production of all the phonemes
in the sentence. The Phonemic score was derived from two ratings
of a single phoneme located in the sentence. (See section on
scoring, and also Appendix I.) Six target phonemes were selected
for scoring, namely /©/, /y/, /^/, /^/, /«€/, /6/. They
were chosen because they generally demand the greatest
amount of phonetic adjustment on the part of an American speaker.
Two of the target phonemes were scored in each of the daily
Training Sessions, so that each was tested twice during the
training. The training and testing utterances were selected so
as to place each target phoneme at one time or another in the
initial, medial, and final position.
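
The "build up" pattern described earlier in this section (each partial utterance presented twice before it is extended by one syllable) can be sketched in a few lines. This is only an illustration under stated assumptions: the printed pattern is read column by column, and the function and variable names are hypothetical. The actual Practice Sessions used variations on the pattern, so the counts below are not expected to match the 16 to 24 occurrences reported for a full session.

```python
from collections import Counter


def build_up(n_syllables=6, repetitions=2):
    """Return the sequence of utterances, each a list of syllable numbers.

    Each partial utterance 1..k is presented `repetitions` times before
    the utterance is extended by one syllable (assumed reading order).
    """
    sequence = []
    for length in range(1, n_syllables + 1):
        utterance = list(range(1, length + 1))
        for _ in range(repetitions):
            sequence.append(utterance)
    return sequence


pattern = build_up()

# How often each syllable position is heard in one complete build-up.
syllable_counts = Counter(s for utterance in pattern for s in utterance)

for utterance in pattern:
    print(" ".join(str(s) for s in utterance))
print(dict(syllable_counts))
```

Early syllables are practiced far more often than late ones in a single build-up, which illustrates why the number of occurrences of a syllable depends on its position in the sentence.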

In planning the short-delay presentations, the question
arose as to whether it would be better for the subject to hear
the model's voice immediately before the "echo" (that is, the
playback of his mimicry) in order to maximize discrimination of
differences between them, or to hear the echo immediately after
his own mimicry so as to minimize delay of information. The
patterns for a single utterance embodying these two alternatives
are as follows:

(1) model's voice: S's mimicry: model's voice: S's echo

(2) model's voice: S's mimicry: S's echo

After trying out these patterns, the second was chosen.
The following assumptions stood in its favor:

1. It was the simplest pattern and therefore likely to be 
the least confusing. 

2. It was the shortest and therefore permitted practice 
on the greatest number of utterances in a given period of time. 

3. Considerable experimental evidence exists to indicate
the advantage of the shortest possible delay between performance
and information as to success (or reinforcement). Delay between
the model's voice and the echo should not be as serious
a handicap as delay between mimicry and echo because the image
of correct French pronunciation should be fairly well established
by frequent repetitions throughout the period of practice of
the French syllables involved. The incorrect aspects of a
given echo should be readily discriminated in contrast with a
fairly well established image. But to correct a particular
error of pronunciation, S would need to compare the sound of
the echo with his image of the particular motor pattern that
produced the sound. The nearer in time the particular motor
pattern to the particular sound, the more effective the comparison
might be.

These assumptive considerations appeared to be justified
by the experience of the psychologist experimenter in responding
to the two patterns. Subjectively, the second pattern
seemed easier and less confusing. Errors in pronunciation with
this pattern were readily detected in the echo, more clearly
than from the immediate feedback occurring during mimicry. Full
perception of the errors took an instant of time following the
echo. This perception was followed by a set to correct the
pronunciation. This set seemed to be "firmed-up" and guided through
hearing the model's voice repeat the utterance either as the
whole or a part of the next utterance in the series.

The actual mode of arranging for a delay between the end
of the model's utterance and the echo to provide for S's intervening
mimicry has been described in the section on equipment.

Another problem arose in programing the practice materials 
so as best to equalize the practice programs for the SD group 
and the others. Since SD required time for the echo, not as 
many utterances per unit of time could be programed for it as 
for the treatments in which only the mimicry was interposed be- 
tween one model utterance and the next. If longer programs were 
prepared for the IF, AF and LD treatments, however, a differential
factor of program design would be introduced. There seemed
to be no way of equalizing the program design for all four treat- 
ments unless all four were given the same program. The decision 
was made, therefore, to adopt the solution that would be most 
convenient for administering the experiment in the Practice Ses- 
sions. The same practice programs were made for all treatments. 
The model's utterances were widely enough spaced to allow for 
the echo on SD, and the subjects in the other groups were instructed
to mimic each utterance two or three times, whichever
seemed best to the subject himself.

Only experimental tests could actually determine the ques- 
tion of whether this arrangement favored the SD treatment or the 
other three, since repeated mimicry of a single utterance might 
actually be advantageous. Ideally, the optimum method for each 
treatment should have been used, but in the absence of knowledge, 
the adoption of the most convenient solution seemed justified. 



To provide for playback on LD, the practice program was re- 
peated each day, and while the other groups practiced it over 
again, the LD group listened to the recording of its first 
Practice Session. 

During the second day of training in Experiment 3, one of 
the subjects in the LD group began to repeat the utterances as 
she listened to the playback. She said she did so to try to 
correct the mistakes she heard and also to keep her attention 
from drifting. Although the morale of all groups appeared to 
be high, definite signs of inattention had been noted in the 
LD group during playback. To control the factor of attention, 
all LD subjects were instructed to repeat utterances as they
listened to playback, and this instruction continued to be given 
throughout the series of Main Experiments. 

Since the AF condition is obviously considered a method of
improving conditions for self-correction, it was decided that
the SD and LD groups would work with inactivated feedback, so
that each group would be used to test only one method of putative
improvement.

Preliminary work with subjects who were completely unfamiliar
with the French language showed that they often completely
misinterpreted the phonemes of utterances that
were meaningless to them. A common misinterpretation was "s"
for "f" or vice versa. The full English approximations to
French vowel phonemes tended to occur, and these were often not
the nearest approximations. Once an error had been made, the
individual's perception of an utterance seemed to be fixed, and
he continued in the error.



This raised the problem of whether our aim should be to
test the effectiveness of the four conditions in enabling the
individual to correct bad errors and to [...]
phonemes are different from English phonemes, or to test the
effectiveness of the varied conditions for achieving a closer
approximation to the French phonological system [...]
some phonological instruction. Our conclusion was that in nearly
all teaching situations students would be given
instructions of some sort that would eliminate errors far outside
the range of the French phonemes prior to or [...]
the laboratory. We therefore decided to introduce a Classroom
Session immediately before each Laboratory Session to acquaint
the students with the two utterances in the training material
on that day.

Classroom Training

On the first day of each experiment, except Experiment 6,
the classroom period was employed to inform the subjects as to
the purpose and method of the experiment and to explain the
various laboratory procedures. They were given the concept of
the phoneme and told that no French phoneme is exactly like any
American phoneme, hence that they should
try to imitate the exact French sounds, rather than translate
them into American sounds. This principle was illustrated by
the difference between the diphthongized American /o/ and the
undiphthongized French /o/ as well as the difference between the
French /r/ and the American /r/. They were then briefly introduced
to the difference between word and sentence [...]
French and American. These instructions were intended to reduce
naive Americanization of the French pronunciation in the Aptitude
Test so that it might be a test of the student's best mimicry
prior to actual training.

For the six Training Sessions, the Classroom Session began
with certain general instructions on French pronunciation. At
the end of the general instructions, the instructor read through
the Pre-test, the build-up preceding the Post-test, and the
Post-test, with the class repeating each utterance. (See
Appendix I.) No special emphasis was placed on the target phonemes
for the day, and the subjects never knew what the target
phonemes were. The students were thus acquainted with the practice
material for the day under circumstances where they could
watch the lips of the model, and gross misinterpretations of the
model utterances were thus eliminated. This procedure also made
the Pre-tests, as compared with the Aptitude Test, a measure of
the improvement effectuated by classroom instruction, while
the difference between the Pre-tests and Post-tests served
as a measure of the improvement produced by laboratory practice.

The nature of the general instruction given in the first
part of the Classroom Session is illustrated by an
account of the procedure on the first training day. The aim was
to impress the subjects with the essential differences between
American and French speakers in the posture and action of the
vocal muscles.

Attention was first called to the absence of gliding or
diphthongizing in French vowels. The instructor went through
the following series of pairs of English and French words, asking
the class to listen carefully to the differences between them:

     dear  - dire
     tea   - ti
     dough - dos
     steel - style
     bah   - bas
     paper - papier
     low   - lot

Then he had the class mimic him after each of the above words,
cautioning them to avoid the glide in the French member of each
pair.

Next the instructor called attention to the
"ping" of the French consonants, pointing out the way in
which the lips are kept firm and crisp. Calling on the class to
note both the absence of glide in the vowels and the "ping"
of the consonants, he went through the above list again and
followed it by having the class mimic him.

Finally, he called attention to the economy of breath and
absence of aspiration after consonants and to the way the
French tongue position in pronouncing /t/ and /d/ avoids
aspiration. Again calling attention to all three of the above
features of French pronunciation, he went through the above
English-French word pairs as before.






To extend the subjects' practice of the French form of posture
and action, the instructor finished the general training for
the day with:

     "The following French words will be pronounced
     twice each for further illustration and imitation.
     As you watch the speaker's lips and tongue carefully,
     try to imitate the sounds accurately. Especially control
     your voice and breath very carefully. Ready?"

The following list was read:

dis, tir, dé, thé, dammer, tas, dot, tort, dos, tot, doux,
tous, deux, teuton, du, tu.

The instructor then presented the training sentences for 
the day as previously described. 

In the succeeding sessions, subjects were introduced to the
whole range of French phonemes, but specific tongue and lip
positions for specific phonemes were not taught (except as indicated
above). Instead, the general characteristics of French
vocal posture and action were re-emphasized, and the subjects
were left to learn for themselves through mimicry and self-correction
the particular values of each phoneme. Thus, the subjects
were given a general approach to learning French pronunciation
that was expected to enable them to make progress, but
room was left to test the effectiveness of the four treatments
to bring about improvement through self-correction, and coaching
was not employed to produce specific phonemic accuracy.

In sum, the first part of the Classroom Session was used
to establish and re-establish at the beginning of each practice
session a general set toward the correct pronunciation of French.
The second part was used to introduce the subjects to the
training material of the day in a manner that would practically
eliminate gross misinterpretation of the French sounds and reduce
difficulties in the organization of syllable sequences without
affording specific instruction in how to produce either the
target phonemes or the others.

During the last three of the six training days, short bits
of conversation, the meaning of which was explained, were
introduced into the general part of the training session to
relieve the monotony of continually mimicking material that was
meaningless to the subject. This was not done earlier in order to
establish thoroughly the set toward purely phonological mimicry.






Schedule for Daily Training Sessions 

The program for each training day for the Replication Ex- 
periments was as follows: 

1. Classroom Session (about 15 minutes, varied with the needs of
   the class on a particular day).

2. Laboratory Session (about 30 minutes).

   a. Warm-up and Pre-test (1½ minutes)

   b. Practice Session (9 minutes)

   c. Rest pause (8 minutes)

   d. Repeat of Practice Session with LD listening to the
      recording of the model (9 minutes)

   e. Warm-up and Post-test (1½ minutes)

(The times are approximate and varied slightly with each
day's material.)
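
As a rough check on the schedule, the laboratory segments can be summed. The sketch below assumes that the Pre-test and Post-test segments each ran about 1.5 minutes (those two figures are partly illegible in the printed schedule); the constant and function names are illustrative only.

```python
# Approximate segment lengths (minutes) for one Laboratory Session.
# The two 1.5-minute values are an assumed reading of the schedule.
LAB_SEGMENTS = [
    ("Warm-up and Pre-test", 1.5),
    ("Practice Session", 9.0),
    ("Rest pause", 8.0),
    ("Repeat of Practice Session", 9.0),
    ("Warm-up and Post-test", 1.5),
]
CLASSROOM_MINUTES = 15.0  # "about 15 minutes"


def lab_total(segments=LAB_SEGMENTS):
    """Return the summed length of the laboratory segments in minutes."""
    return sum(minutes for _, minutes in segments)


if __name__ == "__main__":
    print(lab_total())                      # laboratory time
    print(CLASSROOM_MINUTES + lab_total())  # classroom plus laboratory
```

Under these assumptions the laboratory segments sum to 29 minutes, which agrees with the stated "about 30 minutes"; the remainder of the hour went to the proctors' equipment changes and any delays.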

The material used in the warm-up before the Pre-test each
day was different from the practice material and was used to
make sure that the equipment was in working order, the gain
settings correct, and the subjects accustomed to the situation
before beginning the Pre-test. The warm-up before the Post-test
involved build-ups of the two practice sentences and was
designed to re-accustom the LD and SD groups to the conditions
that the IF and AF groups had been working with, to avoid any
handicap to them that sudden changes in the form of practice
might entail. (See Appendix I.)

The Pre-test was given with all but the AF group in the
ordinary IF condition. Between Pre-test and Training Series,
the connections to the SD machines were changed to the short
delay arrangement. During the rest session, the tapes on the LD
machines were wound back to the beginning of the recorded part
of the training session and set to play back into the earphones.
After the second training session, both SD and LD equipment was
set for the ordinary IF condition during the Post-test. These
procedures, which were carried out by the proctors, provided a
short rest between Pre-test and training sessions and between
training sessions and Post-test. The entire procedure for a
single day required about an hour unless special delays occurred.

In Experiment 6, the Classroom Session was omitted to test 
its effect by comparing performance on Experiment 5 with that on 
Experiment 6. 

The differences between the program for Experiment 3 and 
that for the Replication Experiments will be indicated in the 
section on Testing and Training Materials. 







Testing and Training Materials

The testing and training materials both for the Trial Experiment
(Experiment 3) and for the Replication Experiments (Experiments
4, 5, and 7) as well as for Experiment 6 are shown in Appendix I.
The same Aptitude-Criterion test was employed in Experiment 6 as
in the Replication Experiments.

Both test tapes and training tapes were produced by a native
French-speaking woman whose style of utterance was exceptionally
clear, crisp, and deliberate. Those who have heard these tapes
are agreed that they have never heard a model voice that sounded
easier to imitate.

In the Main Experiments the Aptitude-Criterion test was
administered as an Aptitude Test at the beginning of each
experiment before the treatment groups were separated and also as
a Final Test after the six days of training. To avoid favoring
either inactivated or activated conditions, the first half was
administered with activated headphones and, after a rest of eight
minutes, the second half with inactivated headphones.

Each half was composed of a warm-up sentence and twelve
scored sentences, all of six syllables each. Each of the six
target phonemes appeared for scoring in four of the sentences.
These sentences were built up one syllable at a time from one
syllable to six according to the pattern shown in the Appendix.
This style of buildup was selected after trying a series of
styles and finding them less well-adapted to the capacities of
the subjects.

Every second sentence in the A-C test was used in the daily
Training Sessions and was tested with exactly the same buildup
in the Pre-tests and Post-tests. Each of the six target phonemes
appeared for scoring in two of the twelve sentences of the
Training Program.

This procedure resulted in four kinds of tests that served
as criteria of learning: (1) the sum of the scores for all six
days on the Pre-tests, (2) the sum of all the scores on the
Post-tests, (3) the trained utterances on the Final Test and
(4) the untrained utterances on the Final Test (referred to as
Pre-tests, Post-tests, Final Trained, and Final Untrained,
respectively). The Final Untrained would serve as the measure of
the generalization or transfer from the trained sentences to
utterances that had not been practiced.
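
The relation among the four criteria can be sketched in code. The sketch assumes that the 24 scored Aptitude-Criterion sentences are numbered 1 to 24, that the "every second sentence" used in training is the odd-numbered set, and that the trained sentences were taken two per day in numerical order; the numbering, the day assignment, and all function names are illustrative assumptions rather than details taken from the report.

```python
def split_items(n_sentences=24):
    """Split sentence numbers into trained (every second one) and untrained."""
    numbers = list(range(1, n_sentences + 1))
    trained = numbers[0::2]    # assumed: odd-numbered sentences were trained
    untrained = numbers[1::2]  # the rest appear only in the A-C test
    return trained, untrained


def daily_pairs(trained, days=6):
    """Assign trained sentences to training days in order (assumed order)."""
    per_day = len(trained) // days
    return [trained[d * per_day:(d + 1) * per_day] for d in range(days)]


def criterion_scores(pre, post, final, trained):
    """Return the four criterion sums.

    pre, post: dicts mapping trained sentence number -> daily test score.
    final: dict mapping every sentence number -> Final Test score.
    """
    trained_set = set(trained)
    return {
        "Pre-tests": sum(pre.values()),
        "Post-tests": sum(post.values()),
        "Final Trained": sum(s for k, s in final.items() if k in trained_set),
        "Final Untrained": sum(
            s for k, s in final.items() if k not in trained_set
        ),
    }
```

With 24 sentences this yields 12 trained and 12 untrained items and two trained sentences per day over six days, which matches the counts given in the text.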

As shown in the Appendix, the first three syllables of each
utterance were presented one after another, each twice, at the
beginning of the buildup. This was to permit scoring the second
occurrence of one of the three syllables for the correctness of
a target phoneme. All three syllables were then presented twice,
and the second occurrence was scored for the correctness of a
single phoneme and for overall phonological correctness. The
fourth and fifth syllables were then added one at a time, and
then the entire six-syllable utterance was repeated twice and
the second occurrence scored for overall phonological accuracy.

To avoid monotony, the first part of each Training Series
was devoted to one-syllable utterances, unrelated to the two
training sentences, containing the six phonemes on which the
week's training was concentrated. This, of course, provided for
some degree of generalized training. The variation in types of
buildup and the alternation of utterances, which may be seen by
inspection of the Training Series, were also aimed at avoiding
monotony.

In Experiment 6, as shown in Appendix I, four sentences
were tested and used in the Training Series each day. Hence,
the entire set of sentences in the Aptitude-Criterion test was
trained in this experiment. Otherwise, the laboratory procedure
in Experiment 6 was the same as in the Replication Experiments.
The entire Aptitude-Criterion test was given the first day,
followed by six days of training with Pre- and Post-tests, and
then the Aptitude-Criterion test was administered the final day.
The first day's administration of the Aptitude-Criterion test
must be viewed as a criterion, rather than an aptitude
measurement. It actually measured degree of retention of
pronunciation skills over a week of non-practice. It will,
therefore, be referred to as the Introductory test.

The testing and training materials employed in the Main
Experiments represented a rather complete revision of those used
in Experiment 3. Since the differences can readily be observed
by examination of the materials in Appendix I, it will not be
necessary to discuss them in detail. Briefly, the target
phonemes were different because, after scoring the Experiment 3
phonemes, it was decided that the phonemes should be changed to
those offering the greatest difficulty. This necessitated a
complete change of sentences.

In Experiment 3, the Pre-tests were not announced as such
but appeared to the subjects simply as part of the practice.
There were two practice sessions, in the second of which the LD
subjects heard their first session utterances played back.
Then the "Review and Post-test" was administered to all three
groups, but the fact that they were being tested was not stressed
as much as was later done in the Main Experiments.






In Experiment 3, the principle of confining practice to
three-syllable utterances was not employed. The method used to
give a maximum opportunity for hearing phonological errors and
correcting them was (1) to present single syllables several
times and (2) to include three-syllable utterances in which
there was a short pause between each of the three syllables.
This last method actually appeared to increase difficulties. The
subjects found it harder to remember the utterances when the
model spoke them in this way, and also had trouble with motor
coordination, since their natural rate of speech was different
from that of the model. The failure of this method led to the
decision to use the three-syllable form of training in the Main
Experiments.



Scoring 

Two variables were scored: (1) the Phonemic variable (Ph),
measuring the phonological correctness of single target phonemes,
and (2) the Overall variable (OA), measuring the phonological
correctness of an utterance as a whole.

Two of the scorers (C and S) were experienced French
language teachers with considerable special training in
phonetics. The third (B) was a native French speaker with three
years' experience teaching French to American students. C and S
scored each utterance independently on the Ph variable. C and B
scored the OA variable in similar fashion. All scorers spent
considerable time scoring together, comparing results, and
arriving at agreement concerning their interpretation of the
standards before beginning to score for the Training Experiments.

Each utterance for both variables was scored by ratings
on a seven-point scale that was found in the course of
preliminary trials to be most convenient for the raters. The
standards of scoring were as follows:

3.0: Almost native French
2.5: Between 3.0 and 2.0
2.0: Not correct, but more French than American
1.5: Between 2.0 and 1.0
1.0: Almost wholly American
0.5: Between 1.0 and 0.0
0.0: Badly garbled or wholly American

To avoid halo effects and bias on the part of scorers
through knowledge of the treatment or test that was being scored,
the records made by the student were transcribed onto special
scorers' tapes from a Revere recorder to an Ampex 500 in the
following manner:






A random order was established for the utterances of the
subjects on the Pre-tests and Post-tests combined and also for
the Aptitude Test and Final Test combined. The utterances to
be scored were transcribed onto the scorers' tapes in this
random order, transcribing only six utterances from two
successive sentences at a time. The utterances were identified
for the scorers only in terms of their order on the scorers'
tapes.

The scorers, therefore, heard and rated the one-syllable,
three-syllable, and six-syllable utterances for a subject for two
successive sentences. They then heard and rated the same series
of utterances for the next subject. They had no knowledge of the
treatment group the subject belonged to or of whether the
utterances came from the Pre-tests or the Post-tests in one case
or from the Aptitude Test or Final Test in the other.
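
The blinding procedure amounts to shuffling the recorded blocks before the scorers hear them and keeping the identifying key apart from the tape. A minimal sketch follows; the record fields, the fixed seed, and the function name are assumptions for illustration, and the sketch shuffles whole blocks rather than reproducing the exact tape layout used in the study.

```python
import random


def blind_order(blocks, seed=0):
    """Shuffle scoring blocks and return (tape, key).

    blocks: list of dicts, for example
        {"subject": 12, "treatment": "LD", "test": "Pre", "sentences": (1, 3)}
    tape: the positions the scorers see, identified by order only.
    key: position -> full record, withheld from the scorers.
    """
    rng = random.Random(seed)   # seeded so the order can be reproduced
    shuffled = list(blocks)     # copy; the caller's list is left untouched
    rng.shuffle(shuffled)
    tape = list(range(1, len(shuffled) + 1))
    key = {pos: rec for pos, rec in zip(tape, shuffled)}
    return tape, key
```

Ratings would then be recorded against tape positions only and matched back to subject, treatment, and test through the key after scoring was complete.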

For each variable, therefore, the scorers rated only four
utterances at a time. This was arranged both to prevent halo
effect or stereotyped rating and to permit the scorers to
concentrate on a few utterances at a time. C, who scored both the
Ph and OA variables, first went through each tape for the Ph
variable and then played it through again for the OA variable.

In order to achieve comparable scoring throughout Experi- 
ments 4 through 7, it would have been preferable to arrange the 
utterances for all four experiments and all four tests in random 
order, but considerations of time made this impracticable, since 
it was necessary for the scorers to rate the utterances on one 
experiment while the later experiments were in progress. 

After the scorers had completed their ratings, these ratings 
were punched on tabulating cards, with the ratings for one utter- 
ance only on each card, according to the following scale: 



Scorer's Rating    Number on Card

     3.0                 7
     2.5                 6
     2.0                 5
     1.5                 4
     1.0                 3
     0.5                 2
     0.0                 1

The score on each item of the test was secured by summing the
card numbers for both scorers. Thus the score on an item could
vary from 2 to 14.
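
The card coding is a linear recoding of the rating (card number = 2 x rating + 1), and the item score is the sum of the two scorers' card numbers. A short sketch of that arithmetic follows; the function names are illustrative and do not come from the report.

```python
VALID_RATINGS = (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0)


def card_number(rating):
    """Map a rating on the 0.0-3.0 scale to the number punched on the card."""
    if rating not in VALID_RATINGS:
        raise ValueError(f"not a valid rating: {rating}")
    return int(2 * rating + 1)


def item_score(rating_a, rating_b):
    """Sum the two scorers' card numbers; the result ranges from 2 to 14."""
    return card_number(rating_a) + card_number(rating_b)
```

For example, two ratings of 3.0 give 7 + 7 = 14 and two ratings of 0.0 give 1 + 1 = 2, the two ends of the stated range.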







The Questionnaire 

Immediately after the Final Test in the four Main
Experiments, a questionnaire, shown in Appendix II, was
administered. It was designed to determine the general state of
morale and motivation throughout the experiment as well as to
find what features of the procedure had tended to depress morale
and what features had tended to improve it.

The subjects were asked not to put their names on the ques- 
tionnaire, and they were encouraged to be as frank and objective 
as possible. To overcome the tendency toward kindness or po- 
liteness in responding to questionnaires, it was emphasized that 
critical comments would be of genuine value to the experimenters
in enabling them to correct their errors of procedure and that 
the only way in which such errors could be determined was through 
such a questionnaire as this. 







VI. RESULTS AND DISCUSSIONS



Since the entire investigation involved a considerable
number of complex procedures, it seems advisable to follow the
report of results on each major part of the investigation with a
discussion on that part. This section will begin, therefore,
with the results of the questionnaire, which will throw some
light on certain outcomes of the experimentation. A report on
the Replication Experiments, which constituted the main part of
the investigation, will follow. The Continuation Experiment and
Trial Experiment will then be reported and these reports will
be followed by a general discussion.



The Questionnaire 



Results 

Following an item which identified the treatment group, the 
questionnaire contained four choice response items which may be 
identified as follows: 

Item 2: Estimate of educational value 

Item 3: Rating of interestingness 

Item 4: Rating of boresomeness 

Item 5: Rating of effort 

These items were scored on the following scale running 
from high to low indices of motivation or morale: 



Score    Item

  2      2c   3a   4a   5a
  1      2b   3b   4b   5b
  0      2a   3c   4c   5c
 -1                     5d
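
The scoring key can be written as a lookup table. In the sketch below the -1 assigned to choice 5d is an assumed reading of a partly illegible entry, and the data layout and function names are illustrative only. Note that Item 2 is keyed in the reverse direction from the other items.

```python
# Scoring key for the choice-response items: item -> choice -> score.
# The -1 for choice 5d is an assumed reading of the printed scale.
SCORING_KEY = {
    2: {"c": 2, "b": 1, "a": 0},
    3: {"a": 2, "b": 1, "c": 0},
    4: {"a": 2, "b": 1, "c": 0},
    5: {"a": 2, "b": 1, "c": 0, "d": -1},
}


def score_response(item, choice):
    """Return the morale score for one choice on one item."""
    return SCORING_KEY[item][choice]


def mean_item_scores(responses):
    """responses: list of dicts mapping item -> choice.

    Returns a dict mapping item -> mean score over the respondents
    who answered that item.
    """
    totals, counts = {}, {}
    for resp in responses:
        for item, choice in resp.items():
            totals[item] = totals.get(item, 0) + score_response(item, choice)
            counts[item] = counts.get(item, 0) + 1
    return {item: totals[item] / counts[item] for item in totals}
```

Averaging such scores within an experiment or treatment group gives per-item means of the kind tabulated for the questionnaire.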

Mean scores per item are shown in Table III. To make their
interpretation more meaningful, the following choices on the
questionnaire are given together with their scores:

Item 2: What I learned in this experiment
        c. Will definitely be of value to me (Score, 2)
        b. May be of some value to me (Score, 1)

Item 3: a. All the work I did in this experiment was
           interesting (Score, 2)
        b. Some of the work I did in this experiment was
           interesting (Score, 1)



TABLE III

MEAN SCORES PER ITEM WITH CROSS MEANS FOR EXPERIMENT,
TREATMENT AND ITEM ON THE CHOICE RESPONSE ITEMS OF THE
QUESTIONNAIRE WITH LISTING OF SIGNIFICANT VARIANCES.

        Experiment X Treatment      Experiment X Item

        IF    AF    LD    SD        It2   It3   It4   It5   Mean

Ex 4    1.5   1.4   1.6   1.5       1.3   1.7   1.2   1.8   1.5
Ex 5    1.3   1.3   1.1   1.3       1.1   1.2   1.0   1.8   1.3
Ex 6    1.0   1.3   1.1   1.3       1.1   1.0   1.0   1.6   1.2
Ex 7    1.5   1.6   1.6   1.7       1.2   1.9   1.4
Mean    1.3   1.4   1.4   1.5       1.2   1.4   1.2   1.8

        Treatment X Item                    Significance of Variances

        It2   It3   It4   It5   Mean

IF      1.2   1.4    .9   1.7   1.3         Treatments     N.S.
AF      1.1   1.5   1.2   1.8   1.4         Experiments    p < .001
LD      1.2   1.3   1.2   1.7   1.4         Items          p < .001
SD      1.2   1.4   1.3   1.9   1.5         I X E          p < .001
Mean    1.2   1.4   1.2   1.8               I X E X T      p < .001



Item 4: a. None of the work was boring (Score, 2)
        b. Some of the work was boring (Score, 1)

Item 5: a. I did my best almost all of the time (Score, 2)
        b. I did my best more than half the time (Score, 1)

The means of the various responses to the items for the four
groups combined show that most of the subjects claim to have done
their best most of the time; about half say that all of the work
was interesting and about half that some of the work was
interesting. The responses cluster toward the statement that what
was learned "may be of some value to me", with a stronger
tendency to select "will definitely be of value" than "will never
be of any value." Similarly, the responses approximate "some of
the work was boring", with a greater tendency to say "none of
the work was boring" than "some of the work was very boring."

There is a significant difference between experimental
groups on the total score for the questionnaire, with the high
school groups indicating a higher level of morale than the
college group. About half of this difference is accounted for by
the greater degree of interest expressed by the high school
groups. The average response for the college group is "some of
the work was interesting." The average for the high school
groups approaches "all of the work was interesting." On the
other hand, the college group lays claim to about as much effort
as the high school groups. These differences in response to the
items account for the statistically significant interaction
between items and experiments.

Examination of individual replies and the frequency of
responses between individual items and treatment groups in
separate experiments suggest that the cause of the statistically
significant interaction between items, experiment, and treatment
is the fact that a few subjects showed a different pattern of
response to the items than was characteristic of their
experimental group. As might be expected, there were real
differences between individuals in the way they ranked the
various items.



No significant difference was found between treatments. 
There is a non-significant trend for the IF groups to find the 
experiment more boring and for the SD group to be generally 
higher than the others. 



To analyze the open-ended questions, the responses were 
placed in what appeared to be the most meaningful categories. 
No categories were included that did not contain at least five 
responses in all four groups, and the remaining responses were 
categorized as miscellaneous. Appendix III shows the categories,
together with the number of responses falling in each
category by experiment and treatment. 



The responses shown in Appendix III were categorized into
"high morale" responses and "low morale" responses, with the
responses to the question "What bothered you so that you couldn't
do your best work?" omitted because it was difficult to determine
whether these responses stemmed from high motivation or low
morale. The proportions of high morale responses for experiments
and treatments are as follows:

Ex 4   .621        IF   .567
Ex 5   .429        AF   .655
Ex 6   .552        LD   .521
Ex 7   .732        SD   .600



The higher morale of the high school as compared with the 
college subjects is confirmed by this analysis. The fact that 
the order of morale for the treatment groups changes from that
secured by analysis of the choice response questions points to a 
lack of any real difference in morale between treatments. 






Discussion 

The finding of no significant morale differences between 
treatments is important since it eliminates the factor of morale 
or motivation in explaining any experimental differences in 
efficiency between treatments. 

In discussing the differences between experimental groups,
the item number and category of response letter in Appendix III
will be referred to as follows: (1A) would refer to the first
response category of the first item, namely "Testing (usually
pre-post-tests) or observing own progress." This item, the most
frequently mentioned point of interest, points to the high level
of achievement motivation which seems to have characterized all
groups, and which is also indicated by the high scores on the
fifth item of the choice response section.

The repetitiousness of the practice sessions, which was
deliberately introduced for purposes of stressing training in pho-
nological accuracy, appears to have been more distressing to the 
college group than to the high school groups, particularly the 
junior high school group, although the repetitious practice ses- 
sions were boring to some individuals in all groups (2A,2B,2D, 
3D,4D,5D). The rather elaborate checks to make certain the ma- 
chines were functioning properly at all times, together with the 
constantly repeated instructions (the former found necessary 
during the trial experiment and the latter adapted to the junior 
high school level of requirement) were irritating to some of 
the college group, but not at all to the junior high school 
group (2E,3B,5F). 

After observing the three groups, the experimenters have
come to the conclusion that the process of mimicry itself is
intrinsically uninteresting to most students of college level,
whereas it was intrinsically interesting to the junior high
school group. The senior high school group seemed to stand in
an intermediate position in this respect. Evidence for these
conclusions is found in 1B,1F,2C,3A, as well as in the lower
degree to which the high school groups complained of
repetitiousness. The junior high school group appeared to be
genuinely challenged by the problem of pronunciation. They felt
it to be difficult, but worth struggling with (3F,4C). The senior
high school group was not satisfied, as the junior group was,
with mere mimicry. They wanted to learn the language, hence their
greater appreciation of the classroom sessions, and their
objection to not seeing the written language or knowing the
meaning of the sentences (1C,3E,5B,6D,6J). Throughout the
experiment, the senior high school group begged to know the
meanings of the sentences, and they showed great satisfaction in
learning them at the end in the course of a small party that was
given as a reward for their good work. The junior high school
group paid little attention to the cards on which the meanings of
the sentences were given, but throughout their party continued to
babble the sounds that they had been practicing.

It seems possible that the better progress that young chil- 
dren make in learning to pronounce a language is partly based on 
a greater intrinsic interest in the mere production of new sounds. 
Probably learning to pronounce through mimicry of a foreign voice 
is a more meaningful and interesting procedure the younger the 
student. 

For both groups of high school students, the whole
experience was something of a pleasant adventure. Coming to a
speech laboratory in a university was an exciting novelty,
especially for the senior high school students who thought of
college attendance as the next great step in their course of
growing up. The fact that the staff in this prestigeful place
showed a friendly interest in them was genuinely pleasing, and
this may in part have accounted for their special liking for the
classroom sessions (1C,6C,6D). To both high school groups, the
experience had many of the social as well as experiential values
of a field trip. During the rest pauses, they bought soft drinks
at the vending machine and had a good time generally, whereas the
rest pauses for some of the college students were periods of
boredom and impatience to get on with the work (3G,6B).

There was little glamor in the experience for the college
students. They were typically very busy young men who were not
getting enough sleep (4E), and they were eager to get the job out
of the way as expeditiously as possible (3G,5F). Most of them
did not plan ever to study French. They were uninterested in
what they were learning, but many of them seemed to feel a
genuine interest in the scientific and practical value of the
work. Hence, almost the only thing they found to like about the
experiment was its efficiency and good planning (6A), which it
occurred to fully half of them to mention. There was every
evidence that most of them were interested in doing a similarly
good, efficient job out of a feeling of both pride and
obligation. Learning, however, probably goes on more effectively
when the task is pleasant, and the college students may have been
handicapped in learning by an absence of pleasurable feeling in
spite of their willingness to invest effort.

In closing, three minor outcomes of the questionnaire may be 
mentioned. The senior high school group was naturally somewhat 
disturbed by the occasional malfunctioning of the SD machines 
(3C,4F). The senior high school group more often mentioned out- 
side disturbance or self-consciousness about being heard as being 
bothersome (4B). Finally, although the playback groups did not
have substantially higher morale, some students mentioned hearing 
their own voices on playback as being interesting or helpful 
(1B,6H) . 






The Replication Experiments (Experiments 4, 5 and 7)

Results 

In reporting the results of all experiments, the unit used
will be the per item score from the tabulating cards multiplied
by 100 to avoid decimals. Thus the meaning of all averages for
the Phonemic and Overall variables can be understood in terms of
the standards described in the section on scoring. As indicated
in that section, the scores could run from 2 to 14, a score of 2
being equivalent to a rating of 0.0 by both raters and a score of
14 to a rating of 3.0 by both raters. According to the scoring
standards described there, scores on the per item times 100
scale have the following meaning:

1400 Almost native French 

1000 Not correct# but more French than American 
600 Almost wholly American 
200 Badly garbled or wholly American 

Nearly all the averages that will be reported in this
section lie between 600 and 1000 and represent some degree of
progress between "Almost wholly American" and "Not correct, but
more French than American". This, of course, is what might be
expected for a group of beginners.
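
The reporting unit can be reproduced from the rating scale. The sketch below assumes the card coding given in the scoring section (card number = 2 x rating + 1, summed over the two raters); the function name is illustrative.

```python
def per_item_times_100(ratings_a, ratings_b):
    """Mean per-item score (sum of both raters' card numbers) times 100.

    ratings_a, ratings_b: equal-length lists of ratings on the 0.0-3.0
    scale, one entry per scored item, from the two raters.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    item_scores = [
        (2 * a + 1) + (2 * b + 1) for a, b in zip(ratings_a, ratings_b)
    ]
    return 100 * sum(item_scores) / len(item_scores)
```

Ratings of 3.0, 2.0, 1.0, and 0.0 from both raters map to 1400, 1000, 600, and 200 respectively, which are the anchor points listed above.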

It will be useful to have some standard for the amount of
real difference a given score difference between tests or
treatment groups represents. The most meaningful unit of this
sort is the standard deviation. It was found that the mean
standard deviation of an experimental group for all experiments
and all tests was 77 in terms of the per item times 100 scale
and that there was no appreciable difference between the mean
SD's of the aptitude tests and the criterion tests or between
the mean SD's of the Phonemic and the Overall variables. It was
therefore decided to use 80 as a "standard unit" in estimating
the importance of all changes from test to test and all
differences between treatment or experiment groups. This unit of
80 per item times 100 points represents one fifth of the distance
between "Almost wholly American" and "Not correct, but more
French than American." To facilitate thinking in terms of this
unit, indices in all charts are shown in simple fractions of the
unit.
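
Converting a score difference into standard units is a single division. A minimal sketch, with illustrative names:

```python
STANDARD_UNIT = 80  # per item x 100 points; rounded from the mean SD of 77


def in_standard_units(score_a, score_b, unit=STANDARD_UNIT):
    """Express the difference score_a - score_b in standard units."""
    return (score_a - score_b) / unit
```

The 400-point distance between 600 ("Almost wholly American") and 1000 ("Not correct, but more French than American") is five units, so one unit is one fifth of that distance, as stated.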

Table IV shows a series of abbreviations frequently used in 
reporting means. 

The bulk of the results of statistical analysis for this
section has been placed in Appendix IV. Table A shows a series
of reliability coefficients computed for the Aptitude Test. The
split-halves were secured by correlating odd-even items; the odd
items were those used later in the training series. The two







TABLE IV

FREQUENT ABBREVIATIONS USED IN TABLES AND CHARTS.

(All scores shown in the tables are mean scores or adjusted
mean scores. Units are always per item x 100 as described
in the text.)

Ap:   Aptitude Test (Aptitude-Criterion test administered at
      the beginning of the experiments)
Pr:   Pre-tests (All six Pre-tests administered at the beginning
      of each laboratory training session)
Po:   Post-tests (All six Post-tests administered at the end
      of each laboratory training session)
FT:   Final Trained (Sum of items in Aptitude-Criterion test
      administered at end of experiment which were used in
      training series.)
FU:   Final Untrained (Sum of items in Aptitude-Criterion test
      administered at end of each experiment which were not used
      in training series.)
IF:   Inactivated Feedback (Group with Inactivated Feedback
      treatment)
AF:   Activated Feedback (Group with Activated Feedback
      treatment)
LD:   Long Delay (Group with Long Delay treatment)
SD:   Short Delay (Group with Short Delay treatment)
Ex4:  Experiment 4 (Group in Experiment 4)
Ex5:  Experiment 5 (Group in Experiment 5)
Ex7:  Experiment 7 (Group in Experiment 7)
Ph:   Phonemic variable.
OA:   Overall variable.
Co:   Combined variable (Overall and Phonemic variables added
      together).
M:    Mean (Mean scores for any group)
Mm:   Mean of Means (Mean of any row or column of means)
Dm:   Deviation from Mean of Means (Deviation of a given mean
      from the mean of a set of means)
MDm:  Mean Deviation from Mean of Means (Mean of the deviations
      of a set of means from the mean of the set)






halves had been made closely similar by including exactly two of 
the target phonemes in each half. 
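
An odd-even split-half correlation of the kind described can be sketched as follows. The sketch assumes each subject's item scores are stored in test order with item 1 first, and it uses the ordinary Pearson product-moment correlation; the report does not spell out its computing procedure, so both points are assumptions, as are the function names.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # fails if either half has no variance


def odd_even_halves(item_scores):
    """Return (sum of odd-numbered items, sum of even-numbered items)."""
    return sum(item_scores[0::2]), sum(item_scores[1::2])


def split_half_r(subjects):
    """Correlate odd-half and even-half totals across subjects."""
    halves = [odd_even_halves(scores) for scores in subjects]
    return pearson_r([h[0] for h in halves], [h[1] for h in halves])
```

Here `subjects` is a list with one list of item scores per subject; the returned coefficient is the correlation between half-tests, before any correction for test length.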

Examination of this table shows a satisfactory level of
reliability for both raters combined throughout all three
experiments, although there is some decrease in reliability in
the later experiments, especially for the Phonemic variable,
probably resulting from rater fatigue. The Overall ratings are
consistently more reliable than the Phonemic and are as reliable
as the Phonemic and Overall ratings combined.

Except for the Phonemic ratings in Experiment 5, rater C is
consistently more reliable than the other two. However, the
correlations between raters, especially when corrected for the
attenuation produced by the imperfect reliability of their
ratings, show that the raters were rating on the basis of the
same concept of good pronunciation.
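
The report does not print the attenuation formula it used. The sketch below assumes the standard Spearman correction, in which the observed correlation between two raters is divided by the square root of the product of their reliabilities; the function name and the example values in the note are illustrative and are not taken from the report.

```python
from math import sqrt


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Estimate the correlation between true scores.

    r_xy: observed correlation between the two raters' scores.
    rel_x, rel_y: reliability coefficients of each rater (0 < rel <= 1).
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / sqrt(rel_x * rel_y)
```

With an observed correlation of .72 and reliabilities of .81 and .64, for instance, the corrected value is .72 / sqrt(.81 x .64) = 1.00; a corrected correlation near 1 is what one would expect if two raters are judging the same underlying quality.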

Nevertheless, although the raters agreed with one another
closely as to the relative standings of the subjects, they
interpreted the instructions differently with respect to the
anchorages of their ratings. As shown in Table B, Rater S rated
consistently below Rater C and Rater B rated consistently above
Rater C. Rater B's anchorages definitely shifted relative to C's
from experiment to experiment. The shifts are far above the
random fluctuations found for the [...] of the raters on the odd
and even items of the Aptitude Test.



Because of these differences in anchorages, no
interpretation can be given to average differences between the
two variables; and because of shifts in anchorages from
experiment to experiment, average differences between
experimental groups are uninterpretable. This does not, however,
interfere with comparison of differences between the groups in
amount of improvement over the Aptitude Test shown in their
criterion tests.



Table C in Appendix IV shows the intercorrelations between
the Phonemic and Overall variables for the various tests. The
corrections by the Spearman-Brown formula for the criterion
tests were made because these tests contained only half as many
items as the Aptitude Test. Even without these corrections,
there is a definite tendency for the two variables to be more
closely associated with one another in the Post-test and Final
Trained than in the Aptitude Test, except in Experiment 5.
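The Spearman-Brown step can be illustrated with a minimal sketch. It assumes the standard prophecy formula, r' = kr / (1 + (k - 1)r), with k = 2 for a test of doubled length. The function name and the input value of .60 are hypothetical, and the report does not state exactly how the correction was applied to the intercorrelations.

```python
def spearman_brown(r, length_factor=2.0):
    """Predicted correlation when the test length is multiplied by length_factor.

    Standard Spearman-Brown prophecy formula; length_factor=2 corresponds to
    correcting a half-length test up to full length.
    """
    return (length_factor * r) / (1.0 + (length_factor - 1.0) * r)


# Hypothetical half-length correlation, for illustration only.
half_length_r = 0.60
full_length_r = spearman_brown(half_length_r)
print(round(full_length_r, 3))
```

The correction always raises a positive correlation, which is why the text notes that the trend holds even without it.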



Table D, Appendix IV, shows intercorrelations from one test
to another. The lower part of this table shows certain means of
these correlations which were computed by the z-transformation
method to bring out the average commonalities for three kinds of
tests, as follows:



1. Aptitude Test: Tests skill in pronouncing sentences before
   any instruction or practice.

2. Training Tests, that is, the Pre-tests, Post-tests and Final
   Trained: Test skill in pronouncing sentences after laboratory
   practice and/or classroom instruction have improved their
   pronunciation.

3. Final Untrained: Tests skill in pronouncing utterances other
   than those used in training subsequent to instruction and
   practice on the training utterances.

The mean correlations summarize the commonalities of the
Aptitude Test with the Training Tests and the Final Untrained
and of the Training Tests with each other and with the Final
Untrained. Table D shows the mean of all the correlations between
Aptitude and Training Tests is .64; between Aptitude and Final
Untrained, .73; among the Training Tests, .76; between Training
Tests and Final Untrained, .73. The differences between the
Aptitude-Training Test means and the other means are statistically
significant. The result indicates that the Training Tests were
measuring some factor not correlated with the Aptitude Tests and
that the Final Untrained was only partially affected by this
factor.
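A minimal sketch of averaging correlations through a transformation follows. It assumes the method meant is Fisher's z-transformation: each r is converted to z = atanh(r), the z values are averaged, and the mean is converted back with tanh. The input correlations are hypothetical; the individual values of Table D are not reproduced here.

```python
import math


def mean_correlation(rs):
    """Average correlations via Fisher's z: atanh each r, average, tanh back."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Hypothetical correlations, for illustration only.
example_rs = [0.60, 0.70, 0.80]
print(round(mean_correlation(example_rs), 3))
```

Averaging in the z scale gives slightly more weight to the higher correlations than a plain arithmetic mean of the r values would.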



Interestingly, the Phonemic Aptitude seems to have been a 
better predictor of later scores than the Overall Aptitude; 
whereas the Overall Training tests were better predictors than 
the Phonemic Training tests. 

The examination of Tables C and D led to the conclusion that
the relationship between the variables was sufficiently complex
to justify an analysis of experimental differences for both
variables separately. In the outcome, however, both variables
produced essentially similar experimental results. The data on the
separate analyses, therefore, have been placed in Appendix IV,
and the tables shown in the text are chiefly comparisons of
combined means of the two variables or means derived from combined
scores.

Table E, Appendix IV, shows the raw means for both variables
for all treatment groups in all three experiments and also
the differences between the Aptitude means and the means of the
four criterion tests. These data are condensed in Table V and
Fig. 4, which show the means and differences for the Phonemic
and Overall variables and the Combined means for the entire
population of the three experiments. The only change from test to
test for the Phonemic and Overall variables that is not
statistically significant at the .01 level of confidence is the slight
gain between the Post-test and Final Trained for the Overall
variable. In terms of the standard unit of 80 points there is



Fig. 4. Mean raw scores, all Replication Experiments combined,
for the Phonemic, Overall and combined variables on the Aptitude
Test, Pre-tests, Post-tests, Final Trained and Final Untrained.



TABLE V

MEANS OF PHONEMIC AND OVERALL VARIABLES AND COMBINED MEANS
SUMMING ALL THREE REPLICATION EXPERIMENTS WITH DIFFERENCES BETWEEN
APTITUDE AND CRITERION TEST SCORES.

Special abbreviations: DPr, DPo, DFT, DFU: differences between
Aptitude test and criterion test scores.

        N     Ap    Pr    Po    FT    FU     DPr   DPo   DFT   DFU

Ph     83    632   742   788   757   734     110   156   125   102
OA     33    752   866   932   939   862     114   180   187   110
CO     33    692   804   860   848   798     112   168   156   106

a gain on the Combined variable of 1.39 units between Aptitude
and Pre-test, 0.71 units between Pre-test and Post-test, 2.10
units between Aptitude and Post-test, 1.95 units between
Aptitude and Final Trained and 1.33 units between Aptitude and
Final Untrained.
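The conversion to standard units can be checked with a short sketch that divides differences between the Combined means of Table V by the standard unit of 80 points. The dictionary keys and function name are illustrative. Because the printed means are rounded, the recomputed gains may differ from the figures in the text by about .01 (for example, 112/80 gives 1.40 where the text reports 1.39).

```python
STANDARD_UNIT = 80  # points per standard unit, as stated in the text

# Combined means as printed in Table V.
combined_means = {"Ap": 692, "Pr": 804, "Po": 860, "FT": 848, "FU": 798}


def gain_in_units(later, earlier, means=combined_means):
    """Gain from one test to another, expressed in standard units of 80."""
    return (means[later] - means[earlier]) / STANDARD_UNIT


for later, earlier in [("Pr", "Ap"), ("Po", "Pr"), ("Po", "Ap"),
                       ("FT", "Ap"), ("FU", "Ap")]:
    print(earlier, "->", later, round(gain_in_units(later, earlier), 3))
```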

To determine whether the large gain from Aptitude to
Pre-tests should be attributed to the Classroom Sessions which
preceded the Pre-tests or to the cumulative effects of six days'
practice, a computation was made for each Pre-test day separately
to determine the gain in mean score over the mean score made
on the same items in the Aptitude Test. The results are shown
in Table F, Appendix IV. There is a definite tendency for the
higher gains on the Phonemic variable to have occurred in the
earlier days of training and a slighter tendency on the Overall
variable for the greater gains to have occurred in the later
days of training. The Combined means show greater gains in
the earlier days. These day-to-day differences are probably not
statistically significant, but it appears certain that cumulative
improvement in the training period does not account for the
marked gain on the Pre-tests over the Aptitude Test.

Table G, Appendix IV, shows the means for the four treatment
conditions as they were adjusted in the course of analyses of
covariance for each criterion test in each experiment, using the
Aptitude Test as the predictor. The treatment variance was found
to be significant at the .05 level for only one of these analyses,
namely, the Post-test for Experiment 7. In a set of 24 analyses,
such a result might be expected to occur as a random variation.
Furthermore, the rank order of the treatments varies from
experiment to experiment and test to test. This variation is
greater between experiments for a given test than between tests
within experiments, suggesting that random errors of group
selection for the various treatments may have contributed
considerably to the error variance.
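The remark about 24 analyses can be made concrete with a small sketch. It assumes, for simplicity, that the 24 tests are independent, which they are not strictly (the same subjects and correlated variables recur), so the figures are only a rough guide. The function names are illustrative.

```python
def prob_at_least_one(n_tests, alpha):
    """Chance of one or more 'significant' results when no real effect exists,
    assuming independent tests each run at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(n_tests, alpha):
    """Expected number of chance 'significant' results among n_tests."""
    return n_tests * alpha


print(round(prob_at_least_one(24, 0.05), 3))
print(round(expected_false_positives(24, 0.05), 2))
```

Under that assumption about one chance result in 24 is expected at the .05 level, which agrees with the caution expressed in the text.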



Nevertheless, examination of the table reveals a definite
average superiority for the Activated Feedback condition and
average inferiority for the Short Delay condition. The average
rank order for AF is 1.55, for IF 2.25, for LD 2.35, and for
SD 2.96.

Table VI shows the general trends in the data of Table G,
Appendix IV, secured by combining, that is, taking the means of
the Phonemic and Overall adjusted means. The mean deviations
are indices of the amount of differentiation among the treatment
groups produced either by random errors in group selection or by
real differences in the effects of the treatments. As summarized
in the right-hand column, where the means of all three
experiments are shown, these differences are greatest for the
Post-test, next for the Pre-test, then the Final Trained and the
Final Untrained. The mean standings of the experimental groups
are the same for the first three tests, namely AF first, LD second,
IF third, and SD fourth. These standings change in the Final
Untrained, with IF first, AF second, LD third, and SD fourth. In
short, the first three tests, that is, the Training tests, seem to
reveal on the average a certain definite pattern of differentiation
among the treatments, which tends to disappear in the Final
Untrained.
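The deviations (Dm) and mean deviations (MDm) of Table VI can be reproduced with a brief sketch, here using the Pre-test means for Experiment 4 as printed in that table. The function names are illustrative, and the assumption is that MDm is the mean of the absolute deviations from the mean of the four treatment means. The table entries appear to be rounded to whole points.

```python
# Pre-test combined adjusted means for Experiment 4, as printed in Table VI.
pre_ex4 = {"IF": 812, "AF": 833, "LD": 781, "SD": 766}


def deviations(means):
    """Return the mean of the treatment means and each treatment's deviation."""
    grand = sum(means.values()) / len(means)
    return grand, {name: value - grand for name, value in means.items()}


def mean_deviation(means):
    """Mean absolute deviation of the treatment means from their mean."""
    _, devs = deviations(means)
    return sum(abs(d) for d in devs.values()) / len(devs)


grand, devs = deviations(pre_ex4)
print(grand, devs, mean_deviation(pre_ex4))
```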

The regularity of these average standings from test to test 
is displayed in Fig. 5, which diagrams the data shown in the 
right hand means column of Table VI. 

The rank order of the treatments, however, varies from ex- 
periment to experiment, as indicated in Fig. 6, which shows the 






TABLE VI

COMBINED MEANS OF ADJUSTED PHONEMIC AND OVERALL MEANS OF
TREATMENTS DERIVED FROM ONE-WAY ANALYSIS OF COVARIANCE OF EACH
CRITERION TEST IN EACH EXPERIMENT IN THE REPLICATION EXPERIMENTS
WITH MEANS OF TREATMENT MEANS FOR ALL THREE EXPERIMENTS AND MEANS
OF TREATMENT MEANS FOR EACH TEST IN EACH EXPERIMENT TOGETHER
WITH DEVIATIONS AND MEAN DEVIATIONS OF TREATMENT MEANS FROM
MEANS OF TESTS.

                        Ex4          Ex5          Ex7
                      M    Dm      M    Dm      M    Dm     Mm    Dm

Pre-tests      IF    812   +14    790   -39    774    -7    792   -11
               AF    833   +35    862   +33    800   +19    832   +29
               LD    781   -17    838    +9    794   +13    804    +1
               SD    766   -32    826    -3    755   -26    782   -21
               Mm    798          829          781          803
               MDm          24           21           16           15

Post-tests     IF    869   +15    847   -43    844    +8    853    -7
               AF    897   +43    924   +34    854   +18    892   +32
               LD    828   -26    901   +11    859   +23    863    +3
               SD    822   -32    888    -2    788   -48    833   -27
               Mm    854          890          836          860
               MDm          29           22           24           17

Final          IF    864    +8    817   -36    836     0    839    -9
Trained        AF    897   +41    873   +20    830    -6    867   +19
               LD    844   -12    857    +4    857   +21    853    +5
               SD    819   -37    865   +12    820   -16    835   -13
               Mm    856          853          836          848
               MDm                       18           11           11

Final          IF    809   +18    827    +7    783    -2    806    +7
Untrained      AF    805   +14    832   +12    779    -6    805    +6
               LD    777   -14    822    +2    801   +16    800    +1
               SD    773   -18    798   -22    777    -8    783   -16
               Mm    791          820          785          799
               MDm          16           11            8            7




Fig. 5. Adjusted means of treatments, combining all three
Replication Experiments and Phonemic and Overall variables
for Pre-tests, Post-tests, Final Trained and Final Untrained.
(From right-hand column, Table VI.)




Combined means of each treatment in each experiment for each
test separately. A definite pattern of variation is found between
the three experiments throughout the first three tests.
This pattern disappears in the Final Untrained.

In all the first three criterion tests IF does relatively
best in Experiment 4, slightly less well in Experiment 7, and
very poorly in Experiment 5. AF does best in Experiment 4, next
best in Experiment 5, and the poorest AF performance is in
Experiment 7. LD's relative positions are the reverse of AF,
poorest in Experiment 4, best in Experiment 7. SD is very low
in Experiments 4 and 7, but hovers around the average of the
three groups in Experiment 5. The pattern for each group is
more marked in the Post-test than in the Pre-test. In the Final
Trained, the pattern shows less differentiation from experiment
to experiment for IF, LD, and SD, but the greatest
differentiation of all for AF. In the Final Untrained AF and LD
retain their patterns, but the patterns for IF and SD disappear.
The possible meaning of this peculiar system of regularities
from test to test will be dealt with in the discussion section.

As shown in Table VI and Fig. 5, the differentiation between
treatments is greater for the Pre-tests than for either
of the Final Tests and nearly as great as for the Post-tests.
Since the Pre-tests preceded experimentally differentiated 







Fig. 6. Adjusted means of treatments, combining Phonemic and
Overall variables, showing changes in treatment standings from
experiment to experiment for all four criterion tests. (From
Table VI.)



practice on each day's work, the sources of this variability
could be of only three kinds: (1) random group variation; (2)
differences in conditions under which the groups took the
Pre-tests; (3) generalization to the latter days of Pre-testing of
differential effects resulting from the earlier days. If this
third possibility were true, the treatments would be relatively
alike during the earlier days of Pre-testing and would show
great differentiation on the latter days. To test the third
possibility, mean scores for each of the four treatments on the
first three Pre-tests were compared with mean scores on the
last three, as shown in Table H, Appendix IV. The results
completely disconfirm the third supposition, since the treatment





TABLE VII

ADJUSTED COMBINED SCORE MEANS OF TREATMENTS AND EXPERIMENTS
FOR EACH CRITERION TEST OF THE REPLICATION EXPERIMENTS DERIVED
FROM TWO-WAY ANALYSIS OF COVARIANCE WITH MEANS OF MEANS AND
DEVIATIONS FROM MEANS OF MEANS.

                        Pr           Po           FT           FU
                      M    Dm      M    Dm      M    Dm      M    Dm

Treatments     IF    789   -18    848   -18    834   -15    801    +1
               AF    829   +22    890   +24    867   +20    807    +7
               LD    802    -5    857    -6    850     0    792    -8
               Mm    807          866          850          800

Experiments    Ex4   808    +1    858    -8    868   +18    798    -2
               Ex5   798    -9    857    -9    815   -35    784   -16
               Ex7   814    +7    884   +18    868   +18    818   +18
               Mm    807          866          850          800


most superior on the Pre-tests, the AF, is the only one on which
there is a combined lower average on the last three days than
on the first three.

The combining of adjusted means in Table VI is, of course,
a statistically imprecise procedure. It was undertaken to bring
out in condensed approximate form the relationships revealed
through the adjusted means in Table G, Appendix IV, as a guide
to further, more precise analysis. On the basis of this
exploratory analysis, it was decided to drop the SD treatment from
further analyses, since its inferiority might be explained in terms
of the inferior sound quality of the SD machines as noted in the
section on equipment. Inclusion of the SD treatment in further
analyses might introduce spurious findings of statistical
significance based on an experimental error rather than a true
experimental effect.

Two-way analyses of covariance were made for the Phonemic
variable, the Overall variable and the Overall and Phonemic
combined. In the latter analyses the OA and Ph scores were summed
for the criterion tests, whereas they were used separately to
produce a multiple regression prediction with the Aptitude Test.
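How a covariance adjustment shifts group means can be shown with a deliberately simplified sketch. It assumes a single covariate (an aptitude score), a single grouping factor and a pooled within-group regression slope, whereas the analyses described above were two-way and, for the combined variable, used two aptitude predictors. All data and names below are hypothetical.

```python
def adjusted_means(groups):
    """Covariance-adjusted criterion means for a one-way layout.

    groups maps a group name to a list of (covariate, criterion) pairs.
    Each group's criterion mean is shifted by the pooled within-group slope
    times the distance of its covariate mean from the grand covariate mean.
    """
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    grand_x = sum(all_x) / len(all_x)

    sxy = sxx = 0.0
    group_stats = {}
    for name, pairs in groups.items():
        mean_x = sum(x for x, _ in pairs) / len(pairs)
        mean_y = sum(y for _, y in pairs) / len(pairs)
        group_stats[name] = (mean_x, mean_y)
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        sxx += sum((x - mean_x) ** 2 for x, _ in pairs)

    slope = sxy / sxx  # pooled within-group regression slope
    return {name: mean_y - slope * (mean_x - grand_x)
            for name, (mean_x, mean_y) in group_stats.items()}


# Hypothetical (aptitude, criterion) scores for two groups.
example = {
    "group_a": [(600, 800), (700, 900), (800, 1000)],
    "group_b": [(700, 850), (800, 950), (900, 1050)],
}
print(adjusted_means(example))
```

In this made-up case the group with the lower raw criterion mean comes out ahead after adjustment, because it started from a lower aptitude mean.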

The results of these analyses are shown in Table I,
Appendix IV, for the Phonemic and Overall variables and in Table
[...] for the combined variable. The results of all analyses are
virtually similar as far as treatment variances are concerned. For
the first three tests, with the exception of the Final Trained
for the Phonemic variable, the treatment variances are larger
than the interaction variances. These interaction variances are
produced by the variation of the treatment standings from
experiment to experiment as shown in Fig. 6.



Fig. 7. Deviations from the mean of all three of adjusted
means for the combined variable. Top: Deviations of treatment
means. Bottom: Deviations of means of experiment groups.
(From Table VII.)



Hence it appears that throughout all three experiments ex- 
periment-to-experiment fluctuations in the standing of the treat- 
ments are less significant than their average differences through- 
out the experiments. The absence of substantial treatment vari- 
ance in the Final Trained on the Phonemic suggests that whatever 
effect produced these differences generalized rather weakly to 
the Phonemic variable in the Final Trained. This failure to 
generalize may be related to the marked decrement in Phonemic 
scores from the Post-test to the Final Trained shown in Fig. 4. 

The analyses show a marginal degree of statistical
significance for treatment differences in the Pre-tests on the
Phonemic and in the Post-tests on the Overall and combined variables.
The data fulfill the assumption of homogeneous regression which
underlies the analysis of covariance, but the Hartley test for
homogeneity of adjusted variances in the [...] significant
differences. This casts some further uncertainty on the
finding of significant differences between treatments.
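The statistic behind Hartley's test is simple enough to sketch: it is the ratio of the largest group variance to the smallest. The variances below are hypothetical, and judging significance requires the F-max table for the given number of groups and degrees of freedom, which is not reproduced here.

```python
def hartley_f_max(variances):
    """Hartley's F-max statistic: largest group variance over the smallest."""
    if min(variances) <= 0:
        raise ValueError("variances must be positive")
    return max(variances) / min(variances)


# Hypothetical adjusted variances for three treatment groups.
example_variances = [400.0, 900.0, 1600.0]
print(hartley_f_max(example_variances))
```

A ratio near 1 is consistent with homogeneous variances; the larger the ratio, the less the equal-variance assumption of the analysis can be relied on.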

Table J, Appendix IV, shows the distribution of adjusted
means derived from the analyses reported in Table I. Table VII
shows the distribution of adjusted means for the Combined
variable, together with the deviations from the mean of all three.
Fig. 7 shows the changes in these deviations from test to test.

The consistency of results for treatment variations on the
Pre-tests, Post-tests and Final Untrained can hardly be interpreted
in any other way than to assume that the differences are produced
by the same factor or factors, either by experimentally [...]
effects or effects produced by random group selection. The
somewhat marginal determinations of statistical [...]
the interpretation that the effect was experimentally produced.

For the combined variable, the range of differences between
treatments in "standard units" of 80 is .50, [...], [...] and [...]
for the Pre-tests, Post-tests, Final Trained and Final Untrained
respectively. The difference between the IF and AF
on the Post-tests is one-fourth of the mean gain of 168 for all
groups, including SD, between Aptitude and Post-tests as shown
in Table V. For the Pre-tests, this difference is 36 percent of
the mean gain.
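The arithmetic of this paragraph can be retraced from the treatment means printed in Table VII and the mean gains in Table V. The sketch below recomputes the range between the highest and lowest treatment mean for each test in units of 80, and the AF-IF difference as a share of the mean gain. The values it produces for the Post-tests, Final Trained and Final Untrained are recomputations from the rounded table entries, not figures taken from the report.

```python
STANDARD_UNIT = 80

# Adjusted combined treatment means as printed in Table VII.
treatment_means = {
    "Pr": {"IF": 789, "AF": 829, "LD": 802},
    "Po": {"IF": 848, "AF": 890, "LD": 857},
    "FT": {"IF": 834, "AF": 867, "LD": 850},
    "FU": {"IF": 801, "AF": 807, "LD": 792},
}

# Mean Combined gains over the Aptitude Test, from Table V.
mean_gain = {"Pr": 112, "Po": 168}


def range_in_units(test):
    """Range between highest and lowest treatment mean, in units of 80."""
    values = treatment_means[test].values()
    return (max(values) - min(values)) / STANDARD_UNIT


def af_if_share_of_gain(test):
    """AF minus IF difference as a fraction of the mean gain for that test."""
    means = treatment_means[test]
    return (means["AF"] - means["IF"]) / mean_gain[test]


for test in treatment_means:
    print(test, round(range_in_units(test), 3))
print(round(af_if_share_of_gain("Po"), 3), round(af_if_share_of_gain("Pr"), 3))
```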

The results reported in Table VIII were secured in order to
determine whether the superiority displayed by the AF condition
on the Final Test might be attributable to an enhanced capacity
for dealing with the first half of the test, which was
administered with activated headphones. The table shows that this was
the case. If the Final Test had been given entirely in the
Inactivated condition, the AF group would have done no better than
the others on the Final Trained and would have been somewhat
inferior on the Final Untrained.






TABLE VIII

MEAN COMBINED OVERALL AND PHONEMIC GAINS OVER THE APTITUDE
TEST FOR THE ACTIVATED AND INACTIVATED CONDITIONS IN THE FINAL
TRAINED AND FINAL UNTRAINED TEST FOR ALL THREE REPLICATION
EXPERIMENTS COMBINED COMPARING THE ACTIVATED TREATMENT WITH
THE INACTIVATED AND LONG DELAY TREATMENTS COMBINED, WITH
DIFFERENCES BETWEEN THESE MEAN GAINS.

                 Final Trained              Final Untrained
             Activated  Inactivated     Activated  Inactivated

AF              211         153            132         105
IF + LD         158         146            103         117
Diff.            53           7             29         -12

Note: Aptitude score for AF is 679, for IF and LD combined,
693. Scores adjusted for regression would decrease the raw
gain advantage of AF.
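The "Diff." row of Table VIII is the AF gain minus the combined IF and LD gain under each headphone condition. A short sketch using the table's own figures makes the contrast between the two conditions explicit; the data structure and function name are illustrative.

```python
# Mean combined gains over the Aptitude Test, from Table VIII.
# Each entry is (activated half, inactivated half).
gains = {
    "Final Trained": {"AF": (211, 153), "IF+LD": (158, 146)},
    "Final Untrained": {"AF": (132, 105), "IF+LD": (103, 117)},
}


def af_advantage(test):
    """AF gain minus IF+LD gain, returned as (activated, inactivated)."""
    af_act, af_inact = gains[test]["AF"]
    other_act, other_inact = gains[test]["IF+LD"]
    return af_act - other_act, af_inact - other_inact


for test in gains:
    print(test, af_advantage(test))
```

The AF advantage is sizable only on the activated half; on the inactivated half it is small for the Final Trained and negative for the Final Untrained.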



In the analyses of covariance (Tables I and K, Appendix
IV), significant differences between experiment groups, that is,
college students, senior high school students, and junior high
school students, are found only for the Final Trained test. As
shown in Table J and also Table VIII and Fig. 7, the college
group is considerably below the two high school groups. For the
Combined variable, this difference is .65 of a "standard unit"
of 80. It is slightly over one third of the mean gain from
Aptitude to Final Trained for all groups as shown in Table IV.



Discussion 

The basic objective of the Replication Experiments, as
stated in the section on the analysis of the problem, was to
determine the relative effectiveness of the four treatments for
learning to pronounce. The only one of the four criterion tests
on which a finding of differences between treatments could have
shown a genuine superiority of one treatment condition over
another was the Final Untrained. The other tests could show only
the effectiveness of the treatments in improving the
pronunciation of specifically practiced sentences, measuring the
improvement in terms of scores adjusted on the basis of the Aptitude
Test.






Specifically, the Pre-tests were a measure of the improvement
produced by a Classroom Session in which the subjects were
instructed in the vocal postures for French [...] which they
heard and saw the instructor [...] the sentences and imitated
his pronunciation. The failure of the Pre-tests to show greater
improvement over the Aptitude Test in the later days of
instruction than in the earlier days (Table F, Appendix IV)
indicates that practically all the improvement [...] the
Pre-tests was derived from this Classroom Instruction. The
Pre-tests were, therefore, measures of improvement from
instruction. The Post-tests were measures of immediate
improvement [...]. The Final Trained [...] on the specific
utterances that had been practiced in the previous six days.
Only the Final Untrained was a measure of generalized learning
to pronounce.

It should be noted that there was no experimental
differentiation of treatment prior to the Pre-tests. All [...]
groups had exactly the same experience in the [...]. Hence, the
differences found in the scores of the treatment groups on the
Pre-tests (Tables [...]) [...] not be attributed to any true
treatment effect [...] learning or even [...] specific
improvement. The inferiority of the SD subjects could be
attributed to the slight deficiencies of their equipment. The
superiority of the AF group [...] to the fact that they took the
Pre-tests (and Post-tests) with activated headphones, whereas
the [...] inactivated headphones. The slight superiority of LD
over IF [...] probably best attributed to a random group
selection [...] will be discussed later. Indeed, it is [...]
possibility that all differences were due to this random effect.

The Pre-tests could justifiably have been [...] aptitude
tests relative to the Post-tests [...] measure the effect of the
treatments [...] specific utterances. If this had been done, it
is obvious from the consistency of the relative treatment
standings from Pre-tests to Post-tests (Tables VI and VII,
Figures 5 and 7) that no appreciable differences would have been
found in the [...] means of the treatment groups. In short, the
experiments [...] no superiority or inferiority on the part of
any of the treatments in producing immediate improvement through
practice on specific utterances.

The Final Test was administered, like the Aptitude Test,
with activated headphones for the first half of the test
and inactivated headphones for the second half. The AF group
showed no superiority on the [...] administered with the
inactivated condition but did show a statistically
non-significant [...] administered with the activated condition
(Table VIII). This retention of superiority on practiced
utterances tested under the activated condition is the only
indication of any superiority [...] of treatment as training.
Since it is statistically non-significant, no conclusion can be
made concerning even this limited advantage to the AF condition.



On the unpracticed sentences in the Final Untrained, the
advantage of the AF treatment was so small as to be negligible.
Indeed, the AF group fell below the combined IF and LD averages
on the part of the test administered with inactivated headphones
(Table VIII). For a laboratory training device to be really
superior for teaching pronunciation, its effect must generalize
beyond the training situation and beyond the material practiced.
The unpracticed part of the Final Test (Final Untrained)
provided for only a slight amount of generalization, to different
but similarly structured utterances. The only difference found
was a statistically non-significant tendency to do slightly
better under the condition of activation or inactivation that was
present in practice.

The low standing of the LD group on the Final Untrained,
when taken into consideration with other data, does indicate a
possible inferiority of long delayed playback for generalized
learning. This matter will be discussed further below.

In spite of these negative findings, much information has
been gained in the course of the experiment which leads to
important hypotheses concerning both the measurement of
pronunciation performance and the processes of learning to pronounce.

Since the experiment was designed to test the effectiveness of 
the four treatments, the design did not necessarily provide 
properly controlled tests of the "by-product" findings. It will, 
nevertheless, be useful to point them out as guides to further 
research. The "by-product" findings, together with the finding 
of possible inferiority of the LD condition for "generalized 
learning" will be stated as hypotheses, followed by discussion 
referring to the evidence for them. 

Hypothesis 1. In the initial stages of learning to pronounce,
classroom instruction produces rapid improvement. This
improvement is large compared with that secured from later
laboratory practice.



Each day the subjects were instructed in the classroom,
largely on the general muscular set required for pronouncing
French and the general differences between French and American
pronunciation. The sentence material for the day was repeated
only a few times. Yet the improvement from the Aptitude Test to
the Pre-tests was twice that from the Pre-tests to the
Post-tests. Since improvement on the Pre-tests was no greater for
the latter days of the training period than for the earlier days,
it cannot be attributed simply to a generalized improvement as
the training period progressed.

The chief objection to taking this result at face value is 
that, in the course of about a half hour of training (fifteen 
minutes in the classroom, eighteen minutes in the laboratory) it 
might be expected that early training would achieve the greater 
quantitative advance. There is need for experimentation that 
examines the effects of classroom instruction and laboratory 
training under directly comparable circumstances. 

Hypothesis 2. The Pre-tests, Post-tests and Final Trained
measured a capacity for improvement in pronouncing specific
utterances which was not measured by the Aptitude Test. This
improvement factor was randomly distributed among the treatment
groups from experiment to experiment and brought about random
variations in their adjusted means from experiment to experiment.

In the discussion of Table D, Appendix IV, in the section on
results (p. 41) it is pointed out that the correlations between
the Aptitude Test and the Final Untrained are higher than those
between Aptitude and Pre-tests, Post-tests and Final Trained.
The obvious difference between the Aptitude and the Final
Untrained on the one hand and the Pre-tests, Post-tests and Final
Trained on the other is that the subjects had no opportunity to
improve their pronunciation on specific utterances on the first
two, whereas laboratory practice and/or classroom instruction
intervened to improve pronunciation on the last three. An
improvement factor may be postulated which was measured by the
tests directly affected by training but not by the Aptitude Test.
This factor might actually be the converse of a capacity for
quick readiness to do well on the Aptitude Test or it might be
a factor of motivation or ability leading to superior performance
after practice.

In any case, the improvement factor would be randomly
distributed in varying amounts in the treatment groups and would not
be accounted for in adjusting the means; hence, treatment groups
high in the factor would receive spuriously high means and vice
versa.

Random variations in this factor would account for the
variations from experiment to experiment on the part of treatment
groups exhibited in Table VI and Fig. 6. Thus, relative to one
another, the IF subjects in the fourth and seventh experiments
were "good improvers", whereas in the fifth experiment they were
very "poor improvers".

The Aptitude Test was selected as putatively the best
predictor of later performance because it constituted an exact
work sample of the material that was to be practiced. It would
appear, on the basis of results, that its failure to predict the
full potential for improvement with practice calls for a
different kind of work sample, one in which the utterances
selected for scoring occur at a later stage of practice on a given
sentence and probably one in which there has been some
classroom instruction on how to pronounce. Such a test would be
essentially equivalent to the "training criterion tests" of Table
D, Appendix IV. Although these tests did not predict the
"generalized learning" criterion (that is, the Final Untrained)
better than the Aptitude Tests, it should be realized that not
much generalized learning appears to have occurred during the
six days of Training Sessions. The scores on the Final
Untrained do not reach the level of those on the Pre-tests. It
might be expected that over a course of several months' training,
the "improvement factor" would contribute considerably to
generalized learning, and an aptitude test containing the "improvement
factor" would prove a better predictor, that is, a better measure
of aptitude for generalized learning, than the test actually
employed in these experiments.

Hypothesis 3. The Overall variable alone probably constitutes
an adequate measure of pronunciation ability and has the
advantage of being more economical than the combined Overall and
Phonemic variables.

The experimental results were so similar for the Overall 
and Phonemic variables and the two variables correlate so close- 
ly that there seems to have been little need to employ both vari- 
ables for any other purpose than mutual corroboration. 

Half the time spent in scoring students' utterances was
spent on the Phonemic variable. Although this variable was a
better predictor of criterion tests than the Overall (Table D,
Appendix IV), the Overall after training predicts the Final
Untrained better than the Phonemic.

It seems reasonable to assume that after several months of 
training the Overall variable, tested after adequate instruction 
and practice, would show a definite superiority to the Phonemic 
as a predictor of generalized learning. It has the advantage 
of greater reliability, probably because it offers the raters a 
larger sampling of a subject's pronunciation skill for each 
judgment. It also has the advantage of testing intonation as 
well as accuracy in pronouncing single phonemes. 

Hypothesis 4. Pronunciation performance is better under
conditions of activated feedback than under conditions of
inactivated feedback. However, there is no evidence that this
superiority results in a retained superiority in pronunciation in
other situations. There is some evidence that subjects who have
practiced with activated feedback are superior in performance to
subjects practiced with inactivated feedback when both are
working with the activated feedback condition.

This hypothesis has already been discussed in the
introductory part of this section. Its validity rests upon a decision
as to whether the consistent superiority of the AF condition for
the Pre-tests, Post-tests and Final Trained on both the Phonemic
and Overall variables was due to random selection for the
"improvement factor" or to the fact that the AF condition provides
superior cues for guiding pronunciation. The consistency of the
results from test to test and for both variables precludes the
possibility that the differences between treatments could have
been produced by random errors in measurement, but it is not
beyond the range of probability that the chances of selection could
have led to a series of three AF groups relatively high in
improvement capacity for each of the three experiments.

The analyses of covariance, shown in Tables I, K, and L,
Appendix IV, are designed to put this hypothesis of random group
selection to the test. The rejection of that hypothesis rests
upon the P < .05 level of significance found in four of the
analyses. The fact that the data did not meet the test of homogeneity
of variance could mean that the P's are either too high or too
low, and makes the full acceptance of the reality of better
performance with AF somewhat more shaky than the rather low
confidence level of P < .05 already renders it. However, the evidence
strongly favors the acceptance rather than the rejection of the
hypothesis of a true advantage to the AF condition in controlling
pronunciation performance.

It might be supposed that a more direct measure of the
differences between the AF and IF conditions for performance could
be obtained by finding the difference on the Aptitude Test, where
the first half was administered with activated headphones and the 
second with inactivated headphones. The difficulty is that this 
test was not set up for an experimental comparison. The sentences 
in the first part might have been easier or harder than those in 
the second part. A priori the time arrangements might be expected 
to favor the inactivated portion, since it was administered af- 
ter an 8-minute rest which would eliminate fatigue effects and 
it thus had the advantage of practice effects which might be con- 
siderable with subjects who had never had any previous practice. 

In spite of the probable difficulties in interpretation, 
the data were analysed. They are not reported in the results 
section because they were found to be uninterpretable- The ac- 
tivated half of the Aptitude Test yielded a combined score iL6 
points higher than the inactivated part, a difference significant 
at P<.001. However, the sentences selected for training from 
the activated part of the Aptitude Test yielded combined
scores 13 points higher on the Pre-tests and 20 points higher on
the Post-tests than those selected from the inactivated part.
Hence, the superiority of the activated section of the Aptitude
Test might be due to the fact that the sentences were easier,
and no conclusion can be made.

Hypothesis 5. Long delay playback is inferior to a
comparably administered non-playback condition in effectuating
generalized learning.

The evidence for this hypothesis is indirect and is not 
statistically significant, but it constitutes the only approach 
to a positive conclusion that the experiments yield with regard 
to the effectiveness for learning of the playback conditions. 

(The complete indeterminateness of the findings with regard to 
SD will be discussed in connection with Experiment 6.) 

The hypothesis is based on the assumption that the
consistent superiority of the LD treatment over the IF treatment for
the Pre-tests, Post-tests and Final Trained was a result of
random group selection on the "improvement factor." In the first
place, there was no experimentally varied condition that can ac- 
count for the superiority of LD on the Pre-tests . Both groups 
took the Pre-tests with inactivated headphones prior to any ex- 
perimentally differentiated practice on the utterances. To be 
sure, the later Pre-tests were taken after experimental dif- 
ferentiation, but the superiority of LD on Pre-tests was not the 
effect of learning over the six days of practice as evidenced by 
the fact that the LD group increased its Pre-test score for the 
last three days over the first three days no more than the IF 
group did (Table H, Appendix IV) . 

Furthermore, reference to Fig. 6 shows no consistent su- 
periority of the LD group from experiment to experiment. If 
only experiments 4 and 7 had been performed, LD would have 
averaged lower than IF. The extremely low standing of the IF 
group in Experiment 5 alone accounts for the average superiority 
of LD over IF for all three experiments. It may be positively 
concluded that no result of the experiments indicates any su- 
periority for long delay playback over non-playback conditions. 

Assuming that the LD superiority on the Pre-tests,
Post-tests and Final Trained was a function of chance superiority of
the LD groups on the "improvement factor", the fact that the
adjusted LD mean falls below the adjusted IF mean on the Final
Untrained offers indicative evidence of inferiority on the part
of LD practice to generalize to unpracticed material.

If the Pre-tests and Post-tests had been used to predict
achievement on the Final Untrained it is evident from examination
of Fig. 7 that the adjusted means of AF and LD would have
fallen considerably below the adjusted mean for IF. [...]
on the AF mean could be considered spurious because [...]ble
advantage to the AF condition on the [...] since LD had no such
advantage, the low LD standing could be attributed to a true
failure of the LD treatment to produce as much generalized
learning as the IF. It is highly [...] that the difference would be
statistically significant, [...] assumption of inferiority on the
part of [...] generalized learning must be viewed as definitely
hypothetical.



Hypothesis 6. Whatever special difficulty older students
experience in learning to pronounce is more a function of
inability to retain good pronunciation than inability to achieve it.



This hypothesis is based on the marked and statistically
significant difference between the college group and the high
school groups on the Final Trained (Tables I, J, and K, Appendix
IV, Table VII, and Fig. 7). Examination of Table E, Appendix IV,
reveals that the college group received higher scores on the
Post-tests than did either of the other groups, but regressed
more on the Phonemic scores in the Final Trained and also
regressed on the Overall whereas the other two groups improved.
The combined raw score for the high school groups on the
Post-tests was 845, on the Final Trained 846. The college group
scored 890 on the Post-tests and 853 on the Final Trained.



The fact that the significance of the difference between the
high school and college groups lies at the [...] level for the
Combined variable leaves little doubt that the college group
really did retain less of the improvement it achieved in the
Training Sessions.



The college group scored higher on the Carroll-Sapon and [...]
the Final Trained, this advantage disappeared. [...]
expected that over the course of a few months' teaching, the
high school groups, with better retention of [...]
they made as a result of instruction and practice, would
gradually outstrip the initially more able college group. [...]
would, of course, account for the common observation of language
teachers that younger students learn to pronounce more readily
than their elders.




Note: The term "anchorage" is based on the [...]
qualities of pronunciation are "anchored" to the standard [...]tions
on the rating scale in the judgments of [...]
[...]mediate qualities are rated on the degree to which they
approximate these anchorage points. A given quality for rater B had a
consistently higher anchorage point on the scale than for rater
C, and rater B's anchorages became higher, relative to C's, as
the work of rating progressed.






The Continuation Experiment (Experiment 6)



Results 

In Experiment 6, the subjects were trained on both the
trained and untrained sentences that the same group had been
tested on in Experiment 5. Since each group of sentences had
received different experimental manipulation, the results on the
Experiment 5 trained sentences were analyzed separately from the
results on the Experiment 5 untrained sentences.

Since in the analysis of the Replication Experiments no
really important differences between the Phonemic and Overall
variables appeared, this report on the Continuation Experiment
will be centered on the Combined scores as probably the most
reliable index of performance. However, to show the irregularities
in the means attained by individual treatment groups from test to
test, the raw score means are given for both Phonemic and Overall
in Table G, Appendix IV. The combined means are shown in Table
IX. The irregularities appear to be quite uninterpretable and
are probably due largely to random variations in scoring or the
day-to-day reactions of the subjects. Fig. 8 displays the data
on the bottom line of Table IX. The changes in mean Combined
scores from test to test are shown separately for the sentences
on which the subjects were trained in 5 and those on which they
were not trained. The most remarkable result is the failure of
the group ever to attain the proficiency in Experiment 6 that it
attained on the Post-tests in Experiment 5. The difference
between the Post-tests in Experiment 5 and the Post-tests on the
same items in Experiment 6 is 25 units, which is significant at
the P < .01 level (t = 2.90, df = 26).
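
The reported significance level can be checked from the t value and degrees of freedom alone. The sketch below is an illustration, not the authors' computation: it assumes a two-tailed test (the report does not say which was used) and integrates the Student's t density numerically with Simpson's rule. The function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed P for |t|, via Simpson's rule on the interval [0, |t|].

    steps must be even for Simpson's rule.
    """
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3          # area under the density from 0 to |t|
    return 1.0 - 2.0 * central       # both tails together

# Values reported in the text: t = 2.90 with 26 degrees of freedom.
p = two_tailed_p(2.90, 26)
print(f"two-tailed P = {p:.4f}")
```

Under the two-tailed assumption the result falls between .005 and .01, which agrees with the P < .01 level stated in the text.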

All mean differences between treatment groups in Experiment
6 for both Trained-in-5 and Untrained-in-5 sentences were tested
for significance for the Phonemic and Overall variables
separately and also for the Combined variable. None were found
significant. To show similarities or discrepancies in the standings of
the treatments from Experiment 5 to Experiment 6, the means of
the adjusted Phonemic and Overall scores by treatment are shown
for both experiments in Table X. (Adjusted Combined scores are
not used because they were not computed for Experiment 5.) Mean
deviations of the treatment means from the mean of all four are
shown to indicate the greater variability between treatments in
the Pre- and Post-tests of Experiment 5.

The deviations of the treatment means of Table X from the
mean of all four are portrayed in Fig. 9. Major consistencies
are the first rank for AF on Pre-tests, Post-tests and Final
tests in both experiments and the very low standing of IF on all
but the Final Untrained of Experiment 5 and the Introductory and
Pre-tests of the Untrained-in-5 sentences in Experiment 6, where
SD ranks fourth. Otherwise, SD ranks relatively higher in these
two experiments than in Experiments 4 and 7. (Compare Table VI
and Fig. 6.)







Fig. 8. Means of the college group on all criterion tests in
Experiments Five and Six, shown separately for sentences used for
training in Experiment Five and sentences not used for training
in Experiment Five. (From lower row, Table IX.)



TABLE IX

COMBINED MEANS OF PHONEMIC AND OVERALL MEANS OF TREATMENTS
FOR ALL TESTS IN EXPERIMENT FIVE, THE PART OF EXPERIMENT SIX
TRAINED IN EXPERIMENT FIVE, AND THE PART OF EXPERIMENT SIX
NOT TRAINED IN EXPERIMENT FIVE WITH MEANS OF THE TREATMENT
MEANS.

Note: In Experiment Six "In" signifies Introductory Test and
"F" Final Test.

                                    Experiment 6          Experiment 6
           Experiment 5             Trained in 5          Untrained in 5

      Ap   Pr   Po   FT   FU     In   Pr   Po   F      In   Pr   Po   F

IF   745  797  852  824  834    800  819  853  838    808  820  830  829
AF   713  852  915  864  819    839  858  882  862    804  824  867  860
LD   732  837  900  856  821    855  858  864  865    825  827  835  849
SD   743  833  893  869  804    839  862  861  865    792  802  840  853
Mm   733  830  890  853  820    833  849  865  858    807  818  843












TABLE X

COMBINED MEANS OF ADJUSTED PHONEMIC AND OVERALL MEANS FOR ALL
CRITERION TESTS IN EXPERIMENTS FIVE AND SIX, MEANS OF PARTS
TRAINED AND NOT TRAINED IN FIVE SHOWN SEPARATELY, WITH MEAN
DEVIATIONS OF THE TREATMENT MEANS FROM THE MEAN OF ALL FOUR.

Note: In Experiment Six "In" signifies Introductory Test
and "F" Final Test.

                               Experiment 6          Experiment 6
        Experiment 5           Trained in 5          Untrained in 5

      Pr   Po   FT   FU     In   Pr   Po   F      In   Pr   Po   F

IF   791  836  818  827    795  814  843  832    798  812  824  823
AF   863  925  874  832    846  867  890  874    825  836  878  870
LD   839  901  857  822    855  858  863  865    827  828  836  850
SD   827  888  866  798    837  859  858  862    783  796  835  848
MD    21   26   18   11     19   18   13   13     18   14   18   12
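
The MD row of Table X can be reproduced from the treatment means above it. The sketch below assumes that "mean deviation" means the average absolute deviation of the four treatment means from their own mean; the report does not define the term, but this reading matches the printed values for Experiment 5. The function and variable names are illustrative only.

```python
def mean_deviation(values):
    """Average absolute deviation of the values from their mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)

# Adjusted treatment means for Experiment 5 from Table X,
# in the order IF, AF, LD, SD.
experiment5 = {
    "Pr": [791, 863, 839, 827],
    "Po": [836, 925, 901, 888],
    "FT": [818, 874, 857, 866],
    "FU": [827, 832, 822, 798],
}

md_row = {test: round(mean_deviation(vals)) for test, vals in experiment5.items()}
print(md_row)
```

Rounded to whole units, these come to 21, 26, 18 and 11, the values printed in the MD row for Experiment 5.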



To determine whether the high standing of the AF group on
the Final Test of Experiment 6 was produced by its superiority
on the activated part of the test, the means for the AF group on
the activated and inactivated parts were compared with the means
for the other three groups combined. The AF group scored 872 on
the activated section and 850 on the inactivated. The other three
groups scored 837 on the activated and 862 on the inactivated.
Hence, as in the Replication Experiments, the superiority of the
AF group on the Final Test is almost entirely a function of its
superiority on the activated part of the test.
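
The comparison in the preceding paragraph reduces to two subtractions, shown here as a small worked example. The four means are the ones given in the text; the variable names are illustrative.

```python
# Final Test means in Experiment 6, as given in the text.
af_group = {"activated": 872, "inactivated": 850}
other_groups = {"activated": 837, "inactivated": 862}

# Positive values mean the AF group scored higher than the other three combined.
advantage = {part: af_group[part] - other_groups[part] for part in af_group}

for part, diff in advantage.items():
    print(f"AF minus other groups, {part} part: {diff:+d}")
```

The AF group leads by 35 units on the activated part and trails by 12 on the inactivated part.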



Discussion 



Experiment 6 was deliberately designed to be purely
exploratory. The experimenters wished to see what would happen if
training were continued for another six days without classroom
instruction and what relationship would appear between the
material already practiced in Experiment 5 and the material on which
the group had not been trained in Experiment 5. The aim was not
so much to obtain definitive results as to secure information on
the basis of which hypotheses could be formed.

It should be remembered that the college group employed in
this experiment was poor at retaining the skills it had achieved
in the Post-tests. If one of the high school groups had
performed in this Continuation Experiment, much more improvement
on the basis of what had already been learned might have occurred.

Fig. 9. Deviations of adjusted treatment means from means
of all four on criterion tests. Top: Tests in Experiment
Five. Middle: Tests in Experiment Six, sentences used in
training in Experiment Five. Bottom: Tests in Experiment
Six, sentences not used in training in Experiment Five.
(From Table X.)

Study of Fig. 8 shows that a considerable general
improvement in pronunciation occurred during the six days of training
in Experiment 5. The scores on the untrained material on the
Final Test of Experiment 5 are almost as high as those achieved
on the Pre-tests in that experiment. In the course of the week
that intervened between Final Test in 5 and the Introductory
Test in 6, only a slight decrement occurred. Scores on the
Pre-tests, Post-tests and Final show that training on the formerly
untrained items brought about gradual improvement, but on these
items the subjects never reach even the level that they attained
on the Final Trained in 5.

On the utterances which were practiced in Experiment 5, the 
subjects maintained a somewhat higher average. Their scores on 
these utterances dropped below the level of the Experiment 5 
Pre-tests in the Introductory Test of 6 and on the Final Test 
reached only the level of the Final Test in Experiment 5. On the 
Post-tests, they failed markedly to reach the level of the Post- 
tests in Experiment 5. 

In short, a second six days of training not only failed to
produce improvement over the level attained in Experiment 5, it
actually failed to achieve the level in the Post-tests that was
achieved in Experiment 5. The possibility of a downward shift
in the anchorages of the raters cannot be excluded, but it seems
unlikely, since the chief change in anchorages in the course of
the Experiments would appear to be an upward shift on the part
of Rater B. (See Table B, Appendix IV.) Actually, in
Experiment 6, the Phonemic scores show even less improvement than the
combined scores do because there is a greater difference between
Overall and Phonemic scores in 6 than in 5. This would fit the
assumption that Rater B's anchorage point rose steadily
throughout the four Main Experiments and that the scores in Experiment
6 are spuriously higher than they should be relative to
Experiment 5.

Another possibility is that the practice of four sentences 
during the laboratory Practice Sessions made it difficult to 
achieve the high level of excellence that could be attained when 
concentrating on only two utterances.

A third possibility is that the absence of the Classroom
Session resulted in failure to re-establish the set toward good
pronunciation and hence made it less possible to improve. The
subjects may have forgotten the postures for correct French
pronunciation that they had been reminded of daily in Experiment 5,
and have tended to revert throughout the Practice Session to
American speech postures. In view of the apparent marked effect
of the Classroom Session of producing great improvement in the
Pre-tests over the Aptitude Test it seems reasonable that this
was at least one of the factors involved. If the absence of
classroom instruction was a major cause of lack of improvement in
Experiment 6, it may be concluded that classroom instruction can
not only bring about improvement, but that it can set the stage
for effective improvement in laboratory practice. Perhaps its
most important effect could have been to remind the subjects of
the vocal posture required for effective French pronunciation.

In comparing the adjusted means, it should be remembered 
that none of the differences shown are statistically significant. 
Nevertheless, if these differences fit the hypotheses formed on 
the basis of the Replication Experiments, they provide additional 
support for those hypotheses. Such a comparison fits the
hypothesis that the AF group received advantage from taking the
Pre-tests and Post-tests with activated headphones and that this
advantage carried over to the activated section of the Final Test.

First let us consider the AF performance on the part trained
in Experiment 5. (See Fig. 9) On the Final Trained, AF lost 
much of the superiority it had displayed on the Pre-tests and 
Post-tests because it retained superiority only on the activated 
parts. On the Introductory Test of Experiment 6 it had lost still 
more. On the Pre-tests it again takes first place, but is not far 
ahead of LD and SD. This may be because the absence of a Class- 
room Session lessened the degree of advantage to the AF group of 
the superior cues which the activated condition presumably
provides. On the Post-tests, however, AF is as much superior to the
next best treatment as it was in Experiment 5. Again on the
Final Trained, the AF superiority declines, and it has been
shown that this decline is due to its inferiority on the
unactivated part of the Final Test. Practically the same pattern is
repeated on the material not practiced in Experiment 5, except
that here the relative superiority of AF on the Post-tests is 
even more marked. The outcome of Experiment 6 thus supports the 
fourth hypothesis derived from the Replication Experiments. 

The relationships between the SD and IF treatments conform
to the hypotheses that there were group differences in
improvability which were not measured by the Aptitude Test and that the
SD treatment group was handicapped by inferior sound production
in their machines.

Throughout the running of the experiments, the experimenters
did not realize that what seemed to them to be a relatively
slight inferiority in the SD machines might be a genuine
handicap. They therefore allowed the subjects to sit where they
happened to go for the Aptitude Test and assigned them to the
special treatment positions on the next day. Assuming that the
SD equipment was a handicap, chance differences between groups
in the number of members seated at the SD booths may have
resulted in relatively low Aptitude scores in some treatment
groups and relatively higher criterion scores, thus contributing
to making certain groups "better improvers" than others. The
procedure would, however, tend systematically to lower criterion
scores relative to aptitude scores for the SD groups, and could
account for their generally low adjusted means in the
Replication Experiments.

In Experiment 5, the SD group performed relatively better
than it did in the other experiments, whereas the IF performance
was relatively much poorer except on the Final Untrained. (See
Table VI and Fig. 9.) In terms of Hypothesis 2 in the discussion
of the Replication Experiments, the SD subjects in Experiment 5
were "good improvers" and the IF subjects "poor improvers". SD
did well, IF poorly where they had previous classroom
instruction for the Pre-tests or previous practice on the material for
the Post-tests and Final Trained. On the Final Untrained,
without previous instruction or practice on the material, the
handicap under which the SD group worked reduced it to last place;
whereas the IF group made a score essentially equivalent to its
standing on the Aptitude Test.

In Experiment 6, SD maintained its relatively high position
near the mean of the group on the sentences that had been trained.
On the untrained sentences, however, it ranked below IF on the
Introductory Test and Pre-tests, both of which were presented
prior to any classroom instruction or training on the material.
In short the "good improvers" in the SD group do poorly without
instruction and practice because of their handicap. The "poor
improvers" in the IF group do poorly where opportunity for
improvement has been given.

To sum up, the results of Experiment 6, although not in
themselves statistically significant, conform to the following
assumptions derived from analysis of the Replication Experiments:

1. Superiority of AF on the Pre-tests and Post-tests
indicates that the AF condition provides superior cues
for pronunciation performance.

2. Retained superiority of AF generalizes no further than
to superiority under the activated condition on
material already practiced under that condition.

3. Certain treatment groups in each experiment displayed
relatively high or low special ability to improve that
was not measured by the Aptitude Test.

4. The SD treatment group was handicapped on the criterion
tests, presumably because of inferior sound production
in their machines.









The Trial Experiment



Results 

The procedure in the Trial Experiment was closely similar 
to that in the Replication Experiments. The results, therefore, 
have been analyzed to further test the hypotheses developed on 
the basis of those experiments and to discover what differences 
might have been produced by minor differences in procedure. 

Table XI is comparable with Table V and Table E, Appendix
IV, of the Replication Experiments. There is less difference
between Phonemic and Overall scores in the Experiment 3 data,
but, as has already been shown (Table B, Appendix IV), this may
well be due to changes in the anchorage points of the raters.
In any case, both variables show the same pattern of gain and
loss.



The subjects in Experiments 4 and 7 were in the same age
range as those in Experiment 3. These subjects in the
Replication Experiments gained somewhat more, especially in the
Pre-tests and Post-tests.

An analysis of covariance for each criterion test was
performed for both the Phonemic and Overall variables without any
finding of significant differences. The adjusted means for the
three treatments and the deviations from the mean of all three 
are shown in Table XII, comparable with Table VI and Table G, 
Appendix IV. None of the treatments in Experiment 3 show any 
consistent superiority. This is emphasized by the very slight 
differences between the means of all tests and both treatments 
in the right hand column of Table XII. 
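
For readers unfamiliar with adjusted means, the following sketch shows the standard one-covariate adjustment used in analysis of covariance: each group's criterion mean is corrected for how far its aptitude mean lies from the grand aptitude mean, using the pooled within-group regression slope. This is an assumption about the general method, since the report does not give its computing formulas, and the scores below are hypothetical illustrative numbers, not data from the study.

```python
def adjusted_means(groups):
    """One-covariate ANCOVA adjustment.

    groups maps a treatment name to a list of (aptitude, criterion) pairs.
    Returns criterion means adjusted to the grand aptitude mean.
    """
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    grand_x = sum(all_x) / len(all_x)

    sxx = sxy = 0.0          # pooled within-group sums of squares and products
    means = {}
    for name, pairs in groups.items():
        mx = sum(x for x, _ in pairs) / len(pairs)
        my = sum(y for _, y in pairs) / len(pairs)
        means[name] = (mx, my)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)

    slope = sxy / sxx        # pooled within-group regression slope
    return {name: my - slope * (mx - grand_x)
            for name, (mx, my) in means.items()}

# Hypothetical (aptitude, criterion) scores for three treatment groups.
groups = {
    "IF": [(600, 720), (640, 745), (660, 770), (700, 790)],
    "AF": [(620, 700), (650, 735), (680, 760), (710, 800)],
    "LD": [(630, 730), (660, 750), (690, 765), (720, 805)],
}

adjusted = adjusted_means(groups)
for name, value in adjusted.items():
    print(f"{name}: adjusted mean = {value:.1f}")
```

With equal group sizes the adjustments cancel across groups, so the mean of the adjusted means equals the mean of the unadjusted criterion means; only the relative standing of the treatments changes.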



Discussion 

The lower gains in the Trial Experiment may be attributed
to the fact that the students worked in the laboratory for three
days before taking the Aptitude Test, hence they had already made
some progress. The fact that their standing on the Pre-tests
and Post-tests was lower relative to the Final Test may be
attributed to the fact that the former were not emphasized as being
tests.

The failure of the AF treatment to exceed the others on the
Pre-tests and Post-tests may be due to the fact that the AF
feedback was of inferior quality in Experiment 3, as noted in the
section on equipment. Indeed, this inferiority was more
noticeable to the experimenters than the inferiority of the SD
equipment in the Replication Experiments.






TABLE XI

MEANS OF TREATMENTS IN THE TRIAL EXPERIMENT FOR PHONEMIC AND
OVERALL VARIABLES WITH DIFFERENCES BETWEEN APTITUDE AND
CRITERION SCORES AND WITH MEANS OF BOTH THE MEANS AND THE
DIFFERENCES.

             N    Ap   Pr   Po   FT   FU    DPr  DPo  DFT  DFU

      IF     8   632  738  767  794  721    106  135  162   89
Ph    AF     8   648  708  756  786  726     60  108  138   78
      LD     8   666  732  767  782  721     66  101  116   55
      Mm         649  726  763  787  723     77  114  138   74

      IF     8   697  732  777  861  730     35   80  164   33
OA    AF     8   723  785  826  882  765     62  103  159   42
      LD     8   724  786  851  867  757     62  127  143   33
      Mm         715  768  818  870  751     53  103  155   36

    CoMm         682  747  791  829  737     65  109  147   55
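
The difference columns of Table XI (DPr, DPo, DFT, DFU) are each criterion mean minus the Aptitude mean for the same treatment. A short sketch using the Phonemic rows of the table makes the relationship explicit; the variable names are illustrative.

```python
# Phonemic means from Table XI: Aptitude (Ap) followed by the four
# criterion tests (Pr, Po, FT, FU) for each treatment.
phonemic = {
    "IF": {"Ap": 632, "Pr": 738, "Po": 767, "FT": 794, "FU": 721},
    "AF": {"Ap": 648, "Pr": 708, "Po": 756, "FT": 786, "FU": 726},
    "LD": {"Ap": 666, "Pr": 732, "Po": 767, "FT": 782, "FU": 721},
}

criterion_tests = ["Pr", "Po", "FT", "FU"]

# Gain over the Aptitude Test for each criterion test.
gains = {
    treatment: [scores[test] - scores["Ap"] for test in criterion_tests]
    for treatment, scores in phonemic.items()
}

for treatment, row in gains.items():
    print(treatment, row)
```

The computed gains agree with the printed difference columns for all three treatments.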





TABLE XII

ADJUSTED MEANS OF TREATMENTS IN THE TRIAL EXPERIMENT FOR
PHONEMIC AND OVERALL VARIABLES AND COMBINED MEANS FOR EACH
CRITERION TEST WITH MEANS OF MEANS FOR EACH TREATMENT, MEANS OF
MEANS FOR EACH TEST, AND DEVIATIONS OF TREATMENT MEANS FROM
THE MEANS OF THE TEST.

               Pr          Po          FT          FU
             M    D      M    D      M    D      M    D      Mm    D

      IF    751  +25    781  +18    806  +19    731   +8    767  +17
Ph    AF    709  -17    756   -7    786   -1    726   +3    744   -6
      LD    718   -8    753  -10    770  -17    711   -2    740  -10
      Mm    726         763         787         723         750

      IF    744  -24    787  -31    873   -7    744   -7    787  -15
OA    AF    780  +12    821   +3    877   +7    758   +7    809   +7
      LD    781  +13    846  +28    861   -1    750   -1    810   +8
      Mm    768         818         870         751         802

      IF    748    0    784   -7    840  +11    738   +1    777   +1
Co    AF    745   -3    789   -1    832   +3    742   +5    777   +1
      LD    750   +2    800   +9    816  -13    731   -6    774   -2
      Mm    743         791         829         737         776





In short, the results of the Trial Experiment neither support
nor contradict the findings of the Main Experiments and
point to little but the fact that relatively minor differences 
in procedure may lead to definite differences in results. As- 
suming superiority in performance for the AF condition, they 
conform to the hypothesis that minor deficiencies in sound 
quality may handicap accuracy of mimicry. 






General Discussion and Suggestions for Further Research 
Activated vs. Inactivated Feedback 

The major positive finding of this series of experiments is
the probable superiority of the activated headphone condition to
the inactivated condition for pronunciation performance. Although
measures of the statistical significance of this finding are
marginal, the result may easily be confirmed or disconfirmed by a
relatively simple experiment specifically designed to compare
performance only, without concern for learning. An important
requirement for such an experiment should be that the sound
quality of both the model's utterance and the activated feedback
should be high and equally high.

The failure of the AF groups to display any appreciable
retained superiority on the Final Test, except for the material
which they had already practiced and which was presented in the
test with activated headphones, casts considerable doubt on
whether or not students would actually learn to pronounce better
with activated headphones. Even if the activated headphone
condition does present better cues for controlling performance,
there is no a priori reason for believing that a long term of
practice with this condition would result in generalized
learning of better performance in the normal conversational situation.
The absence of the cues under which practice took place might
actually serve as a handicap, since learned skills are
notoriously dependent on the cues under which they have been learned.
The fact that the AF groups did less well than the other groups
on the part of the Final Untrained administered with inactivated
headphones in the Replication Experiments and similarly on the
Final Test in the Continuation Experiment is indicative of this
possibility.

As pointed out in the introductory analysis of the problem 
(p.3), activated feedback, with a relatively low gain setting, 
produces an impression similar to ordinary speech conditions. 
Practice with such a setting might be optimal for generalization 
from laboratory practice to the conversational situation. 

The above considerations are, of course, speculative. It 
might be that the condition which produces the best performance 
would, over the course of time, produce the highest degree of 
generalized learning. Probably the best means of testing would 
be to equate three groups in several language classes, let each 
group, over the course of a year, practice solely with one of 
three feedback conditions: (1) inactivated headphones, (2)
activated headphones with gain set to reproduce normal speech
conditions, (3) activated headphones with gain set about as high as
is comfortable. Then compare the groups by testing in a conver- 
sational situation. 








Delayed Playback 



Some language teachers express the opinion that long delay
playback bores students and is a waste of time. The
questionnaire, however, failed to show greater boredom in the LD group,
but the technique used in these experiments of having students
repeat utterances while listening to the playback may have
alleviated boredom as well as straying of attention. At any rate,
the LD groups showed as much improvement on specific sentences
in the Post-tests and Final Trained as did the comparable IF
groups, although they practiced mimicry only half as much.
Possibly because of this truncated practice, the LD groups showed
indications of inferiority in generalized learning on the Final
Untrained. There was certainly no indication of superior
learning resulting from the use of the highly expensive playback
equipment.

There can be little question that listening to the playback
of one's own voice is interesting, motivating and possibly
instructive; but a priori it would appear that, if the student
listens back to everything he does, the procedure should become
quite boring and furthermore divert time from valuable practice.
An ideal arrangement might be to have a few recording machines 
in a laboratory with which students could test themselves by 
listening back as often as they wished. Such an arrangement 
could be compared experimentally over the course of a year with 
arrangements allowing for no playback. 



Because of the possible handicap of inferior sound produc- 
tion, the experiments provide no certain evidence in regard to 
the effectiveness of the short delay playback condition. Further 
experimental work with this condition should be done, since there 
are a priori reasons for expecting it to be effective as well as 
reasons for doubting its effectiveness as outlined on page 7 in 
the Analysis of the Problem. 



^Professor Rand Morton of the University of Michigan has recently 
been experimenting with a short delay device under the control 
of the student which echoes the student’s utterance in the same 
way that it was echoed in our short delay equipment. In a per- 
sonal communication he states that this device results in more 
rapid achievement of a criterion of student satisfaction with 
pronunciation on frames of sixty utterances each and that the de- 
gree of pronunciation achievement with which students are satis- 
fied is the same with or without the short delay device. 

The students in these comparisons have all had 30 hours of
training in discrimination of the target language sounds. This 
type of training might well improve ability to take advantage of 
activation as well as short delay playback, and in future studies 
of the effectiveness of laboratory equipment, the interaction
of various forms of equipment with pre-training in discrimination
should be investigated.






Sound Quality 

The compulsions of scheduling which led to the use of the 
short delay equipment before it could be thoroughly tested and 
brought to full equality in sound production with the rest of the 
equipment were certainly unfortunate from the standpoint of experimental
rigor. But the hint offered as to the importance
of good sound production for securing the best possible
results in teaching pronunciation may have valuable repercussions.

As mentioned in the section on equipment, the sound qualities
of the equipment in the laboratory were judged by a competent
outside observer to be of the highest order. The defects
in the short delay equipment were relatively slight, and the
short delay sound system was probably superior to the sound
systems in many laboratories. Yet there are definite indications,
especially in the fluctuations in the standings of the
SD group of "good improvers" and the IF group of "poor improvers"
in Experiments 5 and 6, that these relatively slight defects
were definitely handicapping. The failure of the AF group to
demonstrate superior performance in the Trial Experiment, where
their activated feedback was inferior in quality to
the model's voice, is another indication of this possibility.

In the opinion of the experimenters, experimental comparisons
of performance using the best producible equipment with
equipment of various degrees of inferiority are definitely called
for. In their opinion, also, it might be found more profitable
for language laboratories to expend funds to bring their equipment
to the highest possible level of acoustical excellence than
to spend them on devices for activated feedback or playback.
Again, however, it should be pointed out that high performance
is not necessarily a guarantee of superior generalized learning
and that further research may bring evidence for the value of
activation and playback that this investigation failed to uncover.



Classroom Instruction 

The results indicating the importance of classroom instruc- 
tion conform to what is generally known about the acquisition of 
motor skills. Practice, involving self-correction and over-learning,
is necessary; but instruction in the right way to effect
a skilled performance and correction based on the observations
of trained observers are important, especially if the highest
degree of skill is to be reached. The classroom instructor
in these experiments was engaged in the same function as the
athletic coach when he describes to his proteges the
"form" to employ. His function did not, of course, include the
function performed in many classrooms of correcting errors. 






If improved techniques of teaching pronunciation are to be 
developed, the best classroom techniques are deserving of study. 

A major emphasis of the highly experienced instructor in this experiment
was stress on the correct "posture" for speaking French.
The results of the Continuation Experiment suggest that failure
to administer this instruction each day before entering the laboratory
resulted in lower improvement for the day.

The function of the classroom instruction can be related 
theoretically to B. F. Skinner's observations on animal learning.
An animal placed in a Skinner box is reinforced whenever it emits 
a certain response. Soon the reinforced response is regularly 
emitted; that is, it has become "conditioned". However, if a 
response not in the animal's repertory is desired, some method 
of getting the animal to emit the response must be employed be- 
fore reinforcement can have any effect. The Skinnerian method 
is to "shape" the response by first reinforcing any part of it 
or any approximation to it which occurs. When an approximation
has been conditioned, only the instances of the conditioned re- 
sponse which more closely approximate the desired response are 
reinforced. By this method of successive approximation, the 
desired response is finally shaped and conditioned. 

Self-correction of pronunciation in the laboratory is the 
analogue of reinforcement of emitted responses. The student is 
expected to increase the frequency of responses which most nearly
approximate the model. But students who have never spoken
another language do not have the sounds of the language in their
repertory. If their responses are not "shaped" in some way, they
may never emit responses even closely approximating those of the
target language. Some "shaping" is doubtless achieved in the
process of self-correction, but among human beings, one of the
best ways of shaping responses is to tell the learner how to produce
the desired response. This can be vastly more efficient
than the slow process of shaping animal responses to which what- 
ever shaping is achieved in the laboratory is analogous. 



When a response has been fully conditioned in a Skinner box,
it can be "extinguished," that is, reduced to a minimum frequency
of occurrence, by withholding reinforcement when it occurs. A
day later, however, it will appear and require a new extinction.
The sounds of a native language have been strongly conditioned.
In the course of a session of practice, they become more-or-less
extinguished and the sounds of the target language conditioned
through the process of self-correction. A day later, however,
there is likely to occur a strong spontaneous recovery of the 
native language sounds. Hence, it would appear that a "re- 
shaping" of the target language sounds immediately preceding 
laboratory practice would be a daily necessity for a considerable 
period of time in the course of learning to pronounce, in order 
to provide for a large number of responses within the range of
the target language for reinforcement in the course of laboratory
practice.

The failure of the subjects in Experiment 6 to achieve the 
levels in the Post-tests that they did in Experiment 5 conforms 
to the above theoretical considerations. An experiment to test 
the relative advantages of, say, thirty-five minutes spent in the
laboratory without pre-instruction against twenty minutes in the 
laboratory with fifteen minutes pre-instruction could test the 
practical value of the hypothesis that greater progress will be 
made in conditioning target language sounds and extinguishing 
native language sounds if some "shaping" is achieved before each 
laboratory practice. 

Enough has already been said about the problem of generalization
or transfer from the laboratory situation to the conversational
situation to point to the desirability of classroom
practice together with laboratory practice in order to insure 
generalization of the skills achieved in the laboratory to a 
more conversation-like situation. 

None of the above discussion is intended to suggest that 
classroom instruction alone can provide as efficient learning as 
classroom work — and, if possible, individual instruction — 
combined with laboratory practice. The laboratory obviously 
offers the opportunity for more intense and concentrated prac- 
tice than the classroom, making for rapid extinction of native 
speech sounds and conditioning of target language sounds, pro- 
vided the target sounds have been "shaped" so that they occur 
with considerable frequency. 



Standardizing the Testing of Pronunciation 

The need for a well-standardized work sample test of pro- 
nunciation aptitude is fairly obvious, both for purposes of 
equating experimental groups and as a means of sectioning 
classes. Certain requirements of such a work sample test are applicable
to standardized tests of achievement as well as of aptitude.
The results of the experiment indicate that the problem
of getting reliable ratings is not a difficult one, although it 
might be necessary to test and/or train individual raters to 
make certain that they were able to rate as reliably as the ra- 
ters in this experiment. The lower reliabilities in Experiments 
5 and 7 than in Experiment 4 suggest that rater fatigue may be 
an important factor to watch in any large scale testing. 

One definition of validity for a test of pronunciation 
should certainly be agreement among recognized authorities as to 
what constitutes good pronunciation. The very high level of 
agreement between raters found in these experiments, as far as
relative standings are concerned, suggests that this should
not be a difficult problem. But this may be because the raters 
influenced each other. For a standardized test of pronunciation, 
it should be demonstrable that recognized authorities, scoring 
independently, produce highly correlated results. 

The wide-ranging anchorages of the raters in these experi- 
ments suggest a definite problem for achieving a standardized 
test of pronunciation. Unless the anchorage of different raters 
could be somehow standardized, the scoring of a standardized 
test would give varying averages from rater to rater, no matter 
how well raters agree as to the relative standings of individu- 
als. Standardization would therefore require a standard sample 
of scorings with which raters could practice, comparing their 
own ratings with those of the sample until they found they were 
rating in terms of the anchorage points of the sample; and they
would need to return to the sample occasionally to make sure
their anchorages were not "drifting". In short, standardized
pronunciation tests would require trained scorers, just as many 
standardized psychological tests do. 
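
The distinction drawn above between agreement on relative standings and agreement on anchorage can be illustrated with a small sketch. The scores below are hypothetical, not data from these experiments, and the linear rescaling in `calibrate` is only one assumed way of bringing a rater's anchorage into line with a standard sample; the report does not specify a procedure.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def calibrate(rater_scores, standard_scores):
    """Rescale a rater's scores to the mean and spread of the standard sample."""
    r_mean, r_sd = mean(rater_scores), pstdev(rater_scores)
    s_mean, s_sd = mean(standard_scores), pstdev(standard_scores)
    return [s_mean + (x - r_mean) * s_sd / r_sd for x in rater_scores]


# Hypothetical ratings of the same five recordings.
standard = [2.0, 3.0, 4.0, 5.0, 6.0]  # scoring supplied with the standard sample
lenient = [4.0, 5.0, 6.0, 7.0, 8.0]   # same ordering, higher anchorage

agreement = pearson(standard, lenient)   # agreement on relative standings
offset = mean(lenient) - mean(standard)  # difference in anchorage
adjusted = calibrate(lenient, standard)  # lenient rater after calibration
```

In this made-up case the two sets of ratings correlate perfectly, yet the lenient rater's average is two points higher; only after calibration against the standard sample do the averages coincide.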

The above considerations apply to all kinds of standardized 
pronunciation tests. With respect to the requirements of a
standardized aptitude test, the chief finding from the experi- 
ments is the desirability of measuring pronunciation after some 
opportunity for improvement. The high correlations of the Pre-test
scores with the other "training criterion tests" in Table
D, Appendix IV suggest that the "shaping" derived from pre-instruction
is the major factor producing the condition of improvement
over the more naive approach to sheer mimicry which
characterized our subjects in taking the aptitude test. Actually,
some experimental work would be required to determine the
best methods of pre-instruction and the amount of practice on
sentences requisite to a stable measure of "improved" pronunciation.
Such a standardization of measures of aptitude would obviously
require some training of instructors as well as raters.

Finally, our findings strongly suggest the superior effi- 
ciency and economy of confining the scoring to an overall rating 
of whole utterances. Such ratings should be more valid than the 
scoring of single phonemes, since they would involve judgments 
of intonation as well as phonemic accuracy. 



Age Level and Pronunciation Aptitude 

The comparison between the college and the high school 
groups provides an interesting bit of information relative to 
the common observation that the capacity to achieve good pronun- 
ciation decreases with age. Under practice conditions, the col- 
lege group actually performed better than the high school groups, 
but on the Final Trained, the measure of retained improvement,
its scores were markedly low relative to its initial aptitude.
(The fact that its average raw score was slightly higher than
the average of the two high school groups is uninterpretable because
of uncertainty as to the anchorages of the raters.)

The results of the questionnaire indicated that the college 
students worked conscientiously in the practice sessions, but
that the work was not as meaningful or enjoyable to them as it
was for the younger subjects. One might speculate that, although
they learned to emit good responses, they were not as strongly
reinforced for doing so. Furthermore, the longer years of practice
of their own language, or an age-correlated physiological
change in flexibility of the learning process, might have resulted
in a more massive spontaneous recovery of native language
responses.

Perhaps the chief impact of the finding, assuming that it 
is fully confirmed by later research, is the suggestion that 
adults are fundamentally as capable of achieving good pronuncia- 
tion in a target language as are children. Common observation 
suggests that they almost never do. The explanation might be as
follows: First, adults need to work longer to achieve good pronunciation,
and they are less spontaneously inclined to engage
in such work than children. Second, adults learn to talk a
language fluently before they achieve a near approximation to
native pronunciation, both because they are slower than children
in learning to pronounce and because they may learn fluent speech
more rapidly. Third, once the individual begins to speak fluently,
the inferior pronunciation is over-practiced, and it becomes
extremely difficult to improve it.

Wherever it is desirable to develop near-native pronuncia- 
tion in adults beginning the study of a foreign language, the 
feat might be accomplished by requiring them to practice pronun- 
ciation, always with adequate instruction, for a considerable 
time before attempting to engage in active speech or conversa- 
tion. During that time they could be learning the vocabulary 
and structure of the language through reading. But active speech 
might well be delayed until the highest possible pronunciation 
skill has been achieved and well over-learned. 

The foregoing discussion should suggest, at any rate, that 
the common observation that older students do not learn to pro- 
nounce as well as children offers a challenge to investigation of 
the actual factors producing the effect and raises the question 
as to whether methods of teaching can be developed and tested 
that may overcome the special difficulties that older students 
face, whatever they may be. The finding that one of these difficulties
may be a handicap in retaining the new skills once
they are achieved appears to the experimenters to be an impor- 
tant step in this direction. 






Types of Research on Pronunciation 

At several points in this report, emphasis has been placed
on the fact that no conclusion can be made with regard to the 
efficiency of a method of teaching pronunciation without testing 
it out in actual language courses and with performance in actual 
conversational situations used as the criterion of learning. The 
latter requirement is important, since there is no certainty that 
learning with a device or procedure which works well in the la- 
boratory will generalize effectively to the situations in which 
the language will actually be used. Experiments continuing 
throughout a course are desirable, since the differential effects 
of two laboratory devices may only slowly effect differentiation 
in generalized learning. But they are also desirable because 
minor variations in experimental procedure may easily change the 
apparent outcomes of two experiments, and in the language course, 
the situation resembles most closely the practical situation in 
which the compared methods and devices are to be used. 

A case in point is that motivation in a specialized experi- 
mental situation is usually higher than in day-to-day classroom 
work. Differences in motivation can have major effects on 
learning. High motivation in all experimental groups can mask
out differences between treatment methods which might display
true differences under more relaxed conditions. This is particularly
true, of course, if one treatment is intrinsically more
enjoyable or interesting than another. It might well be that
the activated and playback conditions would show superiority in 
actual course work simply because, as indicated in the question- 
naire, students are interested in hearing their own voices. 



Another point at which real treatment differences might be 
masked out by an experiment directed solely to testing pronunciation
is that such an experiment centers attention on pronunciation.
Where everyone is trying his best to pronounce, the
results may well differ from those that might arise in a situa- 
tion where attention is more divided. Here again, the activated 
and playback conditions might show a superiority in a course 
situation which would not appear in an experimental situation 
because they might tend to call attention to pronunciation more 
than the inactivated condition. 




The specially designed experiment can never really answer 
the question of what is actually going to happen in the practical
situation. Its function is to tease out the variables which
may be important in practice. The present series of experiments
was deliberately designed to measure a number of variables, since
it seemed to the experimenters that this would be the most economical
procedure in the light of the fact that the investigation
was opening up a new field. The results raise a number of questions
that may be answered in terms of more definitely focussed
experimental work. The answers to these questions may be useful 
guides to the planning of investigations in the actual language 
teaching situation. 

In addition to the suggestions already made along these 
lines, there is much opportunity for experimental work in the 
area of programming. What kinds of programs arouse the greatest 
spontaneous interest and motivation? Is progress more rapid 
and/or retention better with such programs? Does knowledge of 
meaning and sight of the written word actually handicap mimicry? 
If it does so in some circumstances, does it do so in all?

In addition to testing the effectiveness of pre-training in 
phoneme discrimination, the question may be raised as to whether 
pre-training in discrimination of intonations might be effective. 
Furthermore, the question may be raised as to whether a fairly 
long period of listening to a target language prior to the be- 
ginning of mimicry or active speech might not "set the stage" 
for more rapid progress as well as progress leading to a finally 
higher level of proficiency. 

These and many other questions may be put to the test of 
specially arranged experiments and finally to the "pay-off" 
tests of effectiveness in the actual teaching of courses. 






VII. SUMMARY

A series of experiments was designed to test the efficiency
of the use of four types of language laboratory equipment for
learning to pronounce French. The four types were (1) inactivated
headphones, (2) activated headphones, (3) playback after
recording a practice session, (4) short delay playback immediately
after the recording of a single utterance. (Hereafter referred
to as IF, or "inactivated feedback," AF, or "activated
feedback," LD, or "long delay playback," and SD, or "short delay
playback.")

After preliminary experimentation, three Replication Experiments
were performed, each with 7 subjects in each of the treatment
groups. The subjects in the first of these were senior
high school students; in the second, college students; and in
the third, junior high school students. (In the college student
experiment one subject dropped out of the SD group.) With the
college group, a Continuation Experiment was performed to observe
progress over a longer time than that allocated to each of
the Replication Experiments.

None of the subjects had ever studied French. Throughout 
both testing and practice sessions they mimicked a model tape recording.
They were not given knowledge of the meaning of utterances
until all experimental work was completed.

In each of the Replication Experiments exactly the same
procedures were employed. Each experiment began with an Aptitude
Test of twenty-four six-syllable sentences. The sentences were 
built up from one-syllable utterances by gradual addition of 
syllables to full six-syllable utterances. The first half of 
the Aptitude Test was administered with activated headphones 
and the second half with inactivated headphones. 

Two of the odd-numbered sentences from the Aptitude Test
were used as training sentences each day for six days. On the 
last day the Aptitude Test was administered again as a final 
criterion test. The twelve sentences used in training were 
scored separately from the twelve not used to comprise the Final 
Trained and the Final Untrained criteria respectively. 
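
The division of the Aptitude-Criterion Test into the two final criteria can be sketched as follows. The sketch assumes that all twelve odd-numbered sentences served as training sentences (two per day for six days) and that each criterion is summarized as a simple average of sentence scores; the averaging, the function names, and the example scores are illustrative assumptions, not the report's scoring procedure.

```python
def split_criteria(n_sentences=24):
    """Return (trained, untrained) sentence numbers of the Aptitude-Criterion Test."""
    numbers = range(1, n_sentences + 1)
    trained = [n for n in numbers if n % 2 == 1]    # odd-numbered: used in training
    untrained = [n for n in numbers if n % 2 == 0]  # even-numbered: never practiced
    return trained, untrained


def criterion_scores(scores_by_sentence):
    """Average per-sentence scores into (Final Trained, Final Untrained)."""
    trained, untrained = split_criteria(len(scores_by_sentence))
    final_trained = sum(scores_by_sentence[n] for n in trained) / len(trained)
    final_untrained = sum(scores_by_sentence[n] for n in untrained) / len(untrained)
    return final_trained, final_untrained


trained, untrained = split_criteria()

# Hypothetical subject: 5 points on every trained sentence, 3 on every other one.
example = {n: (5.0 if n in trained else 3.0) for n in range(1, 25)}
final_trained, final_untrained = criterion_scores(example)
```

A gap between the two averages of this kind is what separates retention of specifically practiced sentences from generalized learning to pronounce.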

Each Training Session began with a fifteen minute period of 
classroom instruction which was the same for all treatment 
groups. The subjects were instructed in correct vocal postures 
for French pronunciation and the training sentences for the day 
were briefly practiced under instruction. The Classroom Session 
was followed by eighteen minutes of laboratory practice in which 
each of the treatment groups worked with its specific treatment 
condition. Prior to and after each Practice Session, a Pre-test
and Post-test were administered which tested pronunciation on the
two sentences for the day exactly as it was tested in the Aptitude
Test and Final Test. In the Pre-tests and Post-tests, as
well as the Practice Session, the AF group worked with activated
headphones and the other groups with inactivated headphones.

Analysis of results showed that the Pre-tests constituted 
a criterion of improvement over Aptitude produced by classroom
instruction. The Post-tests measured further improvement on the
specifically trained sentences, the Final Trained measured retention
of improvement on specific sentences, and the Final Untrained
alone measured genuine generalized learning to pronounce.

Two variables were measured on each test: (1) Phonemic,
measuring accuracy in pronouncing French phonemes particularly
difficult for American speakers; (2) Overall, measuring overall
correctness in pronouncing three-syllable and six-syllable utterances.
The statistical analysis revealed no important differences
in results between these two variables, and much of the report
is made in terms of a Combined variable produced by combining
the scores or means of the two variables.

The following modifications on the above procedures were introduced
in the Continuation Experiment: (1) Four sentences were
practiced every day, so that all the sentences in the
Aptitude-Criterion test were practiced. (2) The Practice Sessions were
not preceded by Classroom Sessions. 

A questionnaire administered immediately after each experi- 
ment revealed no differences between treatment groups with re- 
spect to interest and morale. The college group was signifi- 
cantly lower than the two high school groups in interest and 
morale. All groups claimed to have worked conscientiously
throughout and to have tried to do their best most of the time.
It was noticed that the junior high school group appeared to 
enjoy mimicry for its own sake, whereas continuous mimicry was 
boresome and monotonous to the college group. 



The relative achievement of the groups was compared and its 
significance tested on each of the four criterion tests by means 
of analysis of covariance. A series of twenty-four analyses for 
each of the two variables, each test, and each experiment found 
only one set of significant differences, and this finding was 
judged to be a random variation. There was considerable varia- 
tion in treatment standings from experiment to experiment. This 
was judged to be due to random variations in group selection pro- 
duced by the fact that the Aptitude Test did not measure a dif- 
ferential factor for improvement over Aptitude standing. In 
spite of variations in standing from experiment to experiment, 
the AF treatment averaged high for all experiments and the SD 
low. SD was dropped from further analysis on the ground that
its low standing might be an experimental error occasioned by
a slight inferiority in sound quality on the part of the SD
equipment. Indirect evidence that the SD group suffered a handicap
in taking tests was derived from the data of the Continuation
Experiment.
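
As a reminder of what an analysis of covariance adjusts for, the following sketch computes covariance-adjusted group means: each group's criterion mean is corrected for its Aptitude mean by means of the pooled within-group regression slope. The data are hypothetical, and the code is a textbook one-covariate adjustment offered as an illustration; it does not reproduce the computations carried out for this report, and significance testing is omitted.

```python
from statistics import mean


def adjusted_means(groups):
    """Adjust each group's criterion mean to the grand aptitude mean.

    `groups` maps a treatment label to a list of (aptitude, criterion) pairs.
    The adjustment uses the pooled within-group slope of criterion on aptitude.
    """
    sxy = 0.0
    sxx = 0.0
    all_x = []
    for pairs in groups.values():
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        mx, my = mean(xs), mean(ys)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x in xs)
        all_x.extend(xs)
    slope = sxy / sxx
    grand_x = mean(all_x)
    result = {}
    for label, pairs in groups.items():
        group_x = mean([x for x, _ in pairs])
        group_y = mean([y for _, y in pairs])
        result[label] = group_y - slope * (group_x - grand_x)
    return result


# Hypothetical scores; the second group happens to start with higher aptitude.
data = {
    "IF": [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)],
    "AF": [(3.0, 5.0), (4.0, 6.0), (5.0, 7.0)],
}
result = adjusted_means(data)
```

With these made-up figures the raw criterion means differ by three points, but the adjusted means differ by only one, because two of the three points are attributable to the difference in initial aptitude.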

Two-way analyses of covariance were performed for each criterion
test comparing standings for the IF, AF, and LD groups
and the three experimental groups. On twelve analyses, namely
for the Pre-tests, Post-tests, and Final Trained for the Phonemic,
Overall, and Combined variables, AF was markedly superior
to LD and IF and LD was moderately superior to IF. On the Final
Trained AF was superior only on the part of the test administered
with activated headphones. It was judged that these consistent
differences could be due to random group variations in
the differential factor for improvement or to some systematic
experimental variable. On the ground of three marginal findings
of statistical significance and a fourth significant difference
found between AF and IF alone, it was judged that the most probable
reason for AF superiority was in part, at least, that AF
provided better cues, making for better performance in taking
tests. On the basis of convergent considerations, it was judged
that the consistent superiority of LD over IF was probably due
to random group selection.

On the Final Untrained for all three variables AF stood
first, IF second, and LD third, but the variance was small and
statistically non-significant. AF was superior to the other two
on the part administered with activation but inferior on the part
not so administered. It was judged that, in spite of better
performance during training, the AF condition did not display
superiority under the conditions of this experiment as far as
actual learning to pronounce is concerned. It was judged that
the decrement in LD performance from the training tests to the 
Final Untrained might point to a somewhat lower efficiency of 
the LD condition for generalized learning to pronounce. 

It was concluded that the experiment had failed to demon- 
strate any differences between treatments in efficiency for 
learning to pronounce except for possible lower efficiency on 
the part of the long delay condition. 

The analyses of covariance revealed a marked and statisti- 
cally significant deficiency on the part of the college group 
on the Final Trained. It was judged that difficulties which 
older students encounter in learning to pronounce may be due
more to an inability to retain the results of improvement than 
inability to achieve good pronunciation in the course of a 
practice session. 

Since the improvement from Aptitude to Pre-tests was about
twice as great as the improvement from Pre-tests to Post-tests,
and since the college students in the Continuation Experiment
fell considerably below the levels on the Post-tests that they
had achieved in the preceding Replication Experiment, the pre-instruction
in the Classroom Sessions was judged to be useful,
not only for producing improvement, but for setting the stage 
for later improvement in laboratory practice. 

Two results of the experiments suggested that relatively 
minor deficiencies in the sound quality of laboratory equipment 
may result in definite lowering of performance. The first was 
the relatively low performance of the SD group. The second was
the failure of the AF group to perform in a superior fashion in
a Trial Experiment where the sound quality of the activated feed- 
back was inferior to that obtained in the Replication and Con- 
tinuation Experiments. 

The General Discussion in the foregoing section contains 
considerable theoretical analysis of the experimental results 
as well as suggestion as to their relevance to teaching situa- 
tions. There are also several suggestions as to further re- 
search based on the results of the experiments. 



APPENDIX I

TESTS AND TRAINING PROGRAMS
Contents

                                                        Page

Aptitude-Criterion Test, Main Experiments . . . . . . .  81
Training Programs, Replication Experiments  . . . . . .  82
Training Programs, Continuation Experiment  . . . . . .  85
Aptitude-Criterion Test, Trial Experiment . . . . . . .  87
Training Programs, Trial Experiment . . . . . . . . . .  89



APTITUDE-CRITERION TEST, MAIN EXPERIMENTS 
(Experiments 4, 5, 6, 7) 






The following two sentences show how each sentence in the 
test was built up. The number (2) after an utterance indicates 
it was presented twice in succession. Underlined phonemes were
scored, always in the second presentation of the utterance. The
second presentation of the three-syllable utterance and also the
second presentation of the six-syllable utterance were scored
for approximation of the whole utterance to the French phono- 
logical pattern. 



Un (2)
bon (2)
na (2)
Un bon
Un bon a (2)
Un bon ami
Un bon ami vient
Un bon ami vient tôt (2)

Sur (2)
les (2)
monts (2)
Sur les
Sur les monts (2)
Sur les monts chiens
Sur les monts chiens et
Sur les monts chiens et daims (2)
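
The build-up pattern illustrated above can be written out as a short sketch. It assumes that the sentence is supplied as six written syllables and that the pattern is exactly the one shown in the second example; liaison effects, such as the isolated syllable "na" in the first example, are not modelled. The function name is illustrative.

```python
def build_up(syllables):
    """Return the presentation sequence for one six-syllable sentence.

    Each item is (utterance, times_presented). The first three syllables are
    presented singly, twice each. The sentence is then built up one syllable
    at a time, and the three-syllable and six-syllable utterances (the ones
    scored as whole utterances) are presented twice.
    """
    if len(syllables) != 6:
        raise ValueError("expected six syllables")
    sequence = [(syllable, 2) for syllable in syllables[:3]]
    for n in range(2, 7):
        repeats = 2 if n in (3, 6) else 1
        sequence.append((" ".join(syllables[:n]), repeats))
    return sequence


steps = build_up(["Sur", "les", "monts", "chiens", "et", "daims"])
for utterance, repeats in steps:
    print(utterance, "(2)" if repeats == 2 else "")
```

Each sentence therefore yields eight utterances, five of them presented twice.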



The following lists the sentences in the test. Target 
phonemes are underlined. 



First Half (Presented with Activated Headphones)

Warm-up sentence, not scored:

Je la vois chaque mois d'août.

Scored sentences, in order of presentation:

(1) Un bon ami vient tôt.
(2) Sur les monts chiens et daims.
(3) Ils sont un peu fanés.
(4) &e ce bon preux de l'eau.
(5) Donne-moi sept tapis mauves!
(6) Une jument heurte le sol.
(7) Conduis du bon côté.
(8) Tu as les cheveux plats.
(9) Claude, je veux un stylo.
(10) Sa mule peureuse culbute.
(11) Deux onze livres cinquante.
(12) C'est la plainte d'aucune sainte.

(Eight minutes rest between First Half and Second Half)

Second Half (Presented with Inactivated Headphones)

Warm-up sentence, not scored:

Le coq jaune chante bien mal.

Scored sentences:

(13) A bas, consul affreux!
(14) Le gamin fin l'atteint.
(15) L'ozone a sauvé Paul.
(16) Julie veut faire la queue.
(17) On a vu neuf bûches jaunes.
(18) Chantons cette belle chanson.
(19) Au printemps il pleut trop.
(20) Il a su le faire seul.
(21) Ton oncle plonge jusqu'au fond.
(22) Jouons bien ^ ton d'un luth!
(23) C'est combien ce chapeau?
(24) Jules monte et tombe cinq fois.



TRAINING PROGRAMS, REPLICATION EXPERIMENTS
(Experiments 4, 5, 7)

The following shows the program for the first day. The
same pattern was followed in all succeeding days. The build-up
pattern for the Pre-test and Post-test was the same as in the
Aptitude-Criterion Test (q.v.) and both were scored in the same
way. The target phonemes are underlined.

After the Pre-test was administered, the Training Series
was presented twice, with an eight-minute rest between the
first and second presentations. Then the Post-test warm-up
and Post-test were given.

The number (2) after an utterance indicates it was presented
twice in succession. Target phonemes are underlined.

(Read both columns to horizontal line.)



Pre-test and Post-test

Claude (2)                        A (2)
je (2)                            bas (2)
veux (2)                          con (2)
Claude je                         A bas
Claude je veux (2)                A bas, con (2)
Claude je veux un                 A bas, consul
Claude je veux un sty             A bas, consul a
Claude je veux un stylo. (2)      A bas, consul affreux! (2)






Training Series, Warm-up Words

dis (2)                           dos (2)
tir (2)                           tôt (2)
dé (2)                            doux (2)
thé (2)                           tout (2)
damner (2)                        deux (2)
tas (2)                           teuton (2)
dot (2)                           du (2)
tort (2)                          tu (2)



Training Series, Sentences

Claude (2)                        A (2)
je (2)                            bas (2)
veux (2)                          con (2)
Claude je veux (2)                A bas, con (2)
un (2)                            sul (2)
sty (2)                           a (2)
lo (2)                            ffreux (2)
un stylo (2)                      sul affreux (2)
Claude je (2)                     A bas, (2)
Claude je veux (2)                A bas, con (2)
un sty (2)                        sul (2)
un stylo (2)                      sul affreux (2)
Claude je veux (2)                A bas, con (2)
un stylo (2)                      sul affreux (2)
Claude je veux (2)                A bas, con (2)
un stylo (2)                      sul affreux (2)
Claude je veux (2)                A bas, con (2)
un stylo (2)                      sul affreux (2)




Post-test Warm-up

Claude (2)                      A bas (2)
Claude je veux (2)              A bas, con (2)
Claude je veux un (2)           A bas, consul (2)
Claude je veux un sty (2)       A bas, consul a (2)
Claude je veux un stylo (2)     A bas, consul affreux! (2)



The following shows the warm-up words and sentences for
each day. Numbers preceding the sentences indicate their
number in the Aptitude-Criterion Test. The target phonemes are
underlined.






First day

Warm-up words: dis, tir, dé, thé, damner, tas, dot, tort,
dos, tôt, doux, tout, deux, teuton, du, tu

Sentences: (9) Claude je veux un stylo.

(13) A bas, consul affreux!

Second day

Warm-up words: bis, pis, béret, paix, bas, pas, botte,
porte, beau, paume, doux, pou, boeufs, peu, bu, du

Sentences: (5) Donne-moi sept tapis mauves.

(17) On a vu neuf bûches jaunes.

Third day

Warm-up words: lit, riz, les, raie, la, rat, lotte, [illegible],
lot, râle, loup, roue, pleut, creuse, lu, rue

Sentences: (1) Un bon ami vient tôt.

(21) Ton oncle plonge jusqu'au fond.

Fourth day

Warm-up words: banc, gland, grand, tant, tante, bon, bonoe,
son, sombre, pin, gain, dinde, feindre, timbre, brun,
Verdun

Sentences: (3) Ils sont un peu fanés.

(23) C'est combien ce chapeau?

Fifth day

Warm-up words: patte, tante, robe, rude, déjà, teinte,
crever, neuf, dogue, pire, bien, coq, seul, dame, âne,
peigne

Sentences: (7) Conduis du bon côté.

(15) L'ozone a sauvé Paul.

Sixth day

Warm-up words: côte, tort, trompe, peur, humble, creuse,
tape, rond, rosse, champ, tour, rive, nuage, poêle, zelo,
Jean

Sentences: (19) Au printemps il pleut trop.

(11) Deux onze livres cinquante.



TRAINING PROGRAMS » CONTINUATION EXPERIMENT 

(Experiment 6) 

The following shows the program for the first day; the
same pattern was followed for all six days. The method of
testing and scoring and the schedule of presentation were the
same as for the Replication Experiment (q.v.). Four sentences,
instead of two, were presented each day and the warm-up words
were omitted.



The number (2) after an utterance indicates that it was
presented twice in succession. Target phonemes are underlined.

Pre-test and Post-test 



Claude (2) 
je (2) 
veux (2) 

Claude je 
Claude je veux (2) 

Claude je veux un 
Claude je veux un sty 
Claude je veux un stylo. (2) 

A (2)
bas (2)
con (2)

A bas

A bas, con (2)

A bas, consul

A bas, consul a

A bas, consul affreux! (2)



Chan (2)
tons (2)
cette (2)

Chantons

Chantons cette (2)

Chantons cette belle
Chantons cette belle chan
Chantons cette belle chanson. (2)

Il (2)

a (2)
su (2)
Il a

Il a su (2)

Il a su le

Il a su le faire

Il a su le faire seule. (2)



Training Series 



Claude (2) 

je (2) 

veux (2) 

Claude je veux (2) 
un (2) 
sty (2) 
lo (2) 

un stylo (2) 

Claude je veux (2) 
un stylo (2) 



Chan (2) 
tons (2) 
cette (2) 

Chantons cette (2) 
belle (2) 
chan (2) 
son (2) 

belle chanson (2) 
Chantons cette (2) 
belle chanson (2) 



A (2) 
bas (2) 
con (2) 

A bas, con (2) 
sul (2) 
a (2) 

ffreux (2) 
sul affreux (2) 
A bas, con (2) 
sul affreux (2) 


Il (2)
a (2)
su (2)

Il a su (2)
le (2)
faire (2)
seule (2)

le faire seule (2)
Il a su (2)
le faire seule (2)




Claude je veux
un stylo
Claude je veux
un stylo


Chantons cette
belle chanson
Chantons cette
belle chanson




A bas, con
sul affreux
A bas, con
sul affreux


Il a su
le faire seule
Il a su
le faire seule




Claude je veux
un stylo


Chantons cette
belle chanson




A bas, con
sul affreux


Il a su
le faire seule





Warm-up for Post-test



Claude (2)

Claude je veux (2)

Claude je veux un (2)

Claude je veux un sty (2)
Claude je veux un stylo (2)

A bas (2)

A bas, con (2)

A bas, consul (2)

A bas, consul a (2)

A bas, consul affreux! (2)



Chantons (2)

Chantons cette (2)

Chantons cette belle (2)

Chantons cette belle chan (2)
Chantons cette belle chanson (2)

Il a (2)

Il a su (2)

Il a su le (2)

Il a su le faire (2)

Il a su le faire seule (2)



The following lists the sentences for each day. Numbers
preceding the sentences indicate their number in the
Aptitude-Criterion Test. The target phonemes are underlined.










First day:

(9) Claude je veux un stylo.
(13) A bas, consul affreux!
(18) Chantons cette belle chanson.
(20) Il a su le faire seule.

Second day:

(5) Donne moi sept tapis mauves.
(17) On a vu neuf bûches jaunes.
(2) Sur les monts chiens et daims.
(12) C'est la plainte d'aucune sainte.

Third day:

(1) Un bon ami vient tôt.
(21) Ton oncle plonge jusqu'au fond.
(16) Julie veut faire la queue.
(14) Le gamin fin l'atteint.

Fourth day:

(3) Ils sont un peu fanés.
(23) C'est combien ce chapeau.
(24) Jules monte et tombe cinq fois.
(22) Jouons bien au ton d'un luth.

Fifth day:

(7) Conduis du bon côté.
(15) L'ozone a sauvé Paul.
(4) Ôte ce bon preux de l'eau.
(10) Sa mule peureuse culbute.

Sixth day:

(19) Au printemps il pleut trop.
(11) Deux onze livres cinquante.
(6) Une jument heurte le sol.
(8) Tu as les cheveux plats.



APTITUDE-CRITERION TEST, TRIAL EXPERIMENT 

(Experiment 3) 



The following shows the two patterns of build-up used in 
this test, type A and type B. Utterances followed by (2) were 
repeated a second time. Slant marks indicate a brief pause be- 
tween syllables. For Type A, the second presentations of the 
second and final utterances in the build-up were scored for ap- 
proximation of the whole utterance to the French phonological 
pattern. For Type B, the fifth and final utterances were so 
scored. For both types the two underlined phonemes were scored 
in the second presentation of the final utterance. 



Type A Build-up 

Joue la (2) 

Joue la reine (2) 

Joue la reine de (2) 

Joue la reine de coeur. (2)



Type B Build-up

Le (2)

dos (2)

du (2)

Le / dos / du (2)

Le dos du

Le dos du beau (2)

Le dos du beau bébé. (2)









The following are the sentences used in the test. The
letter preceding an utterance indicates the type of build-up for
that utterance. The numbers preceding scored utterances
indicate their order in the test. The target phonemes are
underlined.



First Half (Presented with Activated Headphones.)

Warm-up sentences, not scored:

(A) Ami, qui joue ici?

(A) Un sou pour ces joujoux.

(A) Voilà un gros pot d'eau.

(B) Celle que j'aime c'est sa soeur.

(B) Ton teint est très laiteux.

(B) Trois enfants font des bonds.

Scored sentences:

(1-B) Le dos du beau bébé.

(2-B) Les cimes des monts sont hautes.

(3-B) Le chou rouge et le riz.

(4-B) Ces deux jeux sont fameux.

(5-B) Ce nain se met dans le coin.

(6-B) Le chien blanc a bien faim.

(7-A) Joue la reine de coeur.

(8-A) La moto arrive tôt.

(9-A) Il court dans la rue.

(10-A) Une fleur jaune n'est pas belle.

(11-A) Il est fort comme un boeuf.

(12-A) Il tombe sur le menton.

Second Half (Presented with Inactivated Headphones.)

Warm-up sentences, not scored:

(A) Paul joue à la pelotte.

(A) Laissez ces p'tits bébés.

(A) Il aime bien le bon vin.

(B) Lucien avait bien faim.

(B) Il n'y a rien dans deux coins.

(B) Voici le hibou rouge.

Scored sentences:

(13-B) Buvez du thé chaud.

(14-B) Donnez-nous des beaux mots.

(15-B) Chariot veut faire la queue.

(16-B) Hélène a peur du boeuf.

(17-B) Bois un verre de liqueur.

(18-B) La dame blonde danse toute seule.






(19-A) Un morceau de gâteau.

(20-A) Qui veut un bon café?

(21-A) L'enfant chante une chanson.
(22-A) Il a pu voir la lune.

(23-A) Les grosses poiranes dans le se^.
(24-A) Cinq rats trouvent du pain noir.



TRAINING PROGRAMS, TRIAL EXPERIMENT 
(Experiment 3) 

The following shows the program for the first day. The
same pattern was followed on all succeeding days. The entire
program, including the Pre-tests, was presented twice with an
eight-minute intermission between presentations. Then the
Review and Post-test was presented. The Pre-tests were scored on
the first presentation. An (S) preceding an utterance indicates
that it was scored for approximation of the whole utterance to
the French phonological pattern. The scored phonemes are
underlined.

(Read both columns to horizontal line)

Pre-test. First Utterance



les 

cimes 

des 

monts 

sont 

hautes 

les 

cimes 

des 



monts 

sont 

hautes 

les / cimes / des 
monts / sont / hautes 
(S) les cimes des 

les cimes des monts (2) 

(S) les cimes des monts sont hautes (2) 



Build-up. First Utterance



les (2)
cimes (2)
des (2)

les / cimes / des (2)
monts (2)
sont (2)
hautes (2)

monts / sont / hautes



les / cimes / des 

monts / sont / hautes 

les / cimes / des 

monts / sont / hautes 

les cimes des (2) 

les cimes des monts (2) 

les cimes des monts sont hautes (2) 







Pre-test. Second Utterance



la 




mo 




to 




a 




rrive 

tot 




la 


(s) 


mo 




to 


(S) 



a 

rrive 

tôt

la / mo / to
a / rrive / tôt
la / mo / to
la moto

la moto arrive (2)
la moto arrive tôt (2)



Build-up. Second Utterance



la (2) 
mo (2) 
to (2) 

la / mo / to (2) 



a (2) 
rrive (2) 
tôt (2)

a / rrive / tôt (2)



la / mo / to

a / rrive / tôt

la / mo / to

a / rrive / tôt

la moto (2)

la moto arrive (2)

la moto arrive tôt (2)



Build-up. Alternating Utterances 



les / cimes / des
les cimes des (2)
les cimes des monts (2)
les cimes des monts sont
les cimes des monts sont

la / mo / to
la moto (2)
la moto arrive (2)
la moto arrive tôt (2)

les cimes des
les cimes des monts
les cimes des monts sont
les cimes des monts sont



la moto

la moto arrive

la moto arrive tôt

(2)

hautes (2)   les cimes des monts

les cimes des monts sont hautes
la moto arrive
la moto arrive

les cimes des monts sont hautes
la moto arrive tôt
les cimes des monts sont hautes
la moto arrive tôt
les cimes des monts sont hautes
hautes la moto arrive tôt






Review and Post-test



les / cimes / des (2) (S)

monts / sont / hautes (2)

les / cimes / des

monts / sont / hautes (S)

les / cimes / des

monts / sont / hautes

la / mo / to (2)
a / rrive / tôt (2)

la / mo / to

a / rrive / tôt (S)

la / mo / to (S)

a / rrive / tôt



les cimes des

les cimes des monts

les cimes des monts sont hautes

la moto

la moto arrive

la moto arrive tôt

les cimes des monts sont hautes
la moto arrive tôt
les cimes des monts sont hautes
la moto arrive tôt
les cimes des monts sont hautes
la moto arrive tôt



The following shows the sentences for each day. Numbers
preceding sentences indicate their number in the
Aptitude-Criterion Test. The target phonemes are underlined.

First Day:

(2) Les cimes des monts sont hautes.
(8) La moto arrive tôt.

Second Day:

(14) Donnez-nous des beaux mots.
(20) Qui veut un bon café?

Third Day:

(4) Ces deux jeux sont fameux.
(10) Une fleur jaune n'est pas belle.

Fourth Day:

(16) Hélène a peur du boeuf.
(22) Il a pu voir la lune.

Fifth Day:

(6) Le chien blanc a bien faim.
(12) Il tombe sur le menton.

Sixth Day:

(18) La dame blonde danse toute seule.
(24) Cinq rats trouvent du pain noir.






APPENDIX II : QUESTIONNAIRE 

In the following questions circle the letter before the answer 
which comes closest to your feeling or belief: 

1. I was in the following group 

a. Short delay 

b. Activated

c. Inactivated 

d. Long delay 

2. What I learned in this experiment

a. Will never be of any value to me.

b. May be of some value to me.

c. Will definitely be of value to me.

3. a. All the work I did in this experiment was interesting.

b. Some of the work I did in this experiment was interesting.

c. None of the work I did was interesting. 

4. a. None of the work was boring. 

b. Some of the work was boring. 

c. Some of the work was very boring. 

5. a. I did my best almost all of the time.

b. I did my best more than half the time. 

c. I did my best less than half the time. 

d. I wasn't really trying any of the time. 

Write brief answers to the following questions: 

1. What parts of the experiment were most interesting? 

2. What parts of the experiment bored you?

3. What things about the experiment irritated you? 

4. What bothered you so that you couldn't do your best work? 

How important was this? 

5. How would you advise us to change the way we went about 
the experiment? 

6. What did you like about the way we went about the experiment? 






APPENDIX III

ANALYSIS OF RESPONSES TO OPEN-ENDED ITEMS ON QUESTIONNAIRE BY
EXPERIMENT AND TREATMENT

                                                 Ex4   Ex5   Ex6   Ex7   Total

Item 1: What interesting?

A. Testing (pre-post-tests)             IF         3     4     4     2      13
   or observing own progress            AF         1     1     2     2       6
                                        LD         3     1     1     1       6
                                        SD         2     5     2     -       9
                                        Total      9    11     9     5      34

B. Playback (Ss in IF and AF were       IF         2     -     -     1       3
   allowed to hear their recording      AF         2     -     -     -       2
   played back after the final          LD         5     1     2     3      11
   test)                                SD         3     1     2     4      10
                                        Total     12     2     4     8      26

C. Classroom work                       IF         3     1     -     -       4
                                        AF         4     -     1     -       5
                                        LD         1     -     -     1       2
                                        SD         1     -     -     1       2
                                        Total      9     1     1     2      13

D. Learning new sounds or               IF         -     1     3     1       5
   pronunciation of new language        AF         -     1     1     1       3
                                        LD         -     1     -     -       1
                                        SD         1     -     -     1       2
                                        Total      1     3     4     3      11

E. Everything was interesting           IF         -     1     -     1       2
                                        AF         -     1     -     2       3
                                        LD         -     -     -     -       -
                                        SD         1     -     -     1       2
                                        Total      1     2     -     4       7

F. Nothing was interesting or no        IF         -     1     -     -       1
   part more interesting than           AF         -     -     1     -       1
   another                              LD         1     2     2     -       5
                                        SD         -     -     -     -       -
                                        Total      1     3     3     -       7

G. Practice or training sessions        IF         2     -     -     -       2
                                        AF         -     1     -     -       1
                                        LD         -     -     -     1       1
                                        SD         1     -     -     1       2
                                        Total      3     1     -     2       6

H. Miscellaneous                        IF         2     -     -     3       5
                                        AF         -     5     4     1      10
                                        LD         -     3     3     -       6
                                        SD         1     2     2     -       5
                                        Total      3    10     9     4      26



                                                 Ex4   Ex5   Ex6   Ex7   Total

Item 2: What was boring?

A. Repetitiousness                      IF         3     3     4     2      12
                                        AF         1     2     2     2       7
                                        LD         2     6     5     2      15
                                        SD         1     5     1     1       8
                                        Total      7    16    12     7      42

B. Second practice session              IF         2     1     1     2       6
   (Playback for LD) or length          AF         1     1     2     1       5
   of practice                          LD         1     1     1     3       6
                                        SD         2     -     -     1       3
                                        Total      6     3     4     7      20

C. Nothing was boring                   IF         1     -     1     -       2
                                        AF         -     -     1     3       4
                                        LD         3     1     -     2       6
                                        SD         2     -     -     4       6
                                        Total      6     1     2     9      18

D. Practice sessions (without           IF         2     2     2     2       8
   qualification)                       AF         2     -     -     -       2
                                        LD         1     -     1     1       3
                                        SD         -     -     4     -       4
                                        Total      5     2     7     3      17

E. Various parts of instructions,       IF         -     1     -     -       1
   testing machines, warm-ups           AF         1     -     1     -       2
   for tests                            LD         -     -     -     -       -
                                        SD         1     -     2     -       3
                                        Total      2     1     3     -       6

F. Miscellaneous                        IF         2     -     -     2       4
                                        AF         1     4     -     -       5
                                        LD         3     -     -     1       4
                                        SD         2     2     -     -       4
                                        Total      8     6     -     3      17



                                                 Ex4   Ex5   Ex6   Ex7   Total

Item 3: What irritated?

A. Nothing irritated                    IF         3     2     2     2       9
                                        AF         3     3     2     6      14
                                        LD         -     1     3     4       8
                                        SD         -     1     2     5       8
                                        Total      6     7     9    17      39

B. Various features of instructions     IF         3     3     -     -       6
   and testing equipment                AF         1     3     3     -       7
                                        LD         1     2     1     -       4
                                        SD         -     7     -     -       7
                                        Total      5    15     4     -      24

C. Mechanical failures or               IF         -     -     -     -       -
   variations in loudness               AF         1     -     -     -       1
                                        LD         1     -     -     1       2
                                        SD         8     -     -     1       9
                                        Total     10     -     -     2      12

D. Various causes of                    IF         -     1     2     -       3
   boredom                              AF         1     -     1     -       2
                                        LD         -     2     1     -       3
                                        SD         -     -     3     -       3
                                        Total      1     3     7     -      11

E. Not knowing the meaning              IF         2     -     -     -       2
   of the sentences                     AF         -     -     -     -       -
                                        LD         5     -     -     -       5
                                        SD         1     -     -     -       1
                                        Total      8     -     -     -       8

F. Subject's own failures               IF         1     -     -     3       4
                                        AF         1     -     -     -       1
                                        LD         -     -     -     1       1
                                        SD         -     -     -     1       1
                                        Total      2     -     -     5       7

G. Delays; waiting for latecomers,      IF         -     1     2     -       3
   rest periods                         AF         -     -     -     -       -
                                        LD         -     1     -     -       1
                                        SD         -     2     -     -       2
                                        Total      -     4     2     -       6

H. Uncomfortable headphones             IF         -     -     -     -       -
                                        AF         -     1     1     1       3
                                        LD         -     2     -     -       2
                                        SD         -     -     -     -       -
                                        Total      -     3     1     1       5

I. Miscellaneous                        IF         -     -     1     2       3
                                        AF         1     -     1     -       2
                                        LD         1     2     1     1       5
                                        SD         -     1     -     -       1
                                        Total      2     3     3     3      11



Ex4



Item 4: What bothered?



A. Nothing bothered so as to
prevent best work



IF

AF

LD

SD



3 

1 

3 

1 




B. External disturbances, others' IF

voices, movements of proctors, AF
self-consciousness when LD

others heard



1 

4 

3 

4 



Total 12 



C. Difficult material, remembering IF

long sentences, discriminating AF

sounds, frustration LD

at failure



D. Repetitiousness, boredom, 
mind wandering, losing 
interest 



E. Fatigue, drowsiness,
yawning 



F. Equipment breakdown or 
malfunction 



G. Miscellaneous 



Total ’ 1 



IP 

AP 

LD 

SD 



Total 



IP 

AP 

LD 

SD 



Total 



IP 

AP 

LD 

SD 



Total 



IP 

AP 

LD 

SD 



1 

1 

1 

2 




H. Difficulty specified as 
being "important", "fairly 
important", "important when 
it happened" 



IP 

AF 

LD 

SD 



                                                 Ex4   Ex5   Ex6   Ex7   Total

I. Difficulty specified as              IF         1     -     1     4       6
   being not important                  AF         1     2     1     -       4
                                        LD         3     -     -     -       3
                                        SD         2     -     2     3       7
                                        Total      7     2     4     7      20



                                                 Ex4   Ex5   Ex6   Ex7   Total

Item 5: Suggested changes

A. No suggestion or good as             IF         2     2     3     4      11
   it is                                AF         3     3     4     4      14
                                        LD         2     4     4     3      13
                                        SD         4     -     -     4       8
                                        Total     11     9    11    15      46

B. Give meanings or show                IF         4     1     1     -       6
   words visually                       AF         -     -     -     -       -
                                        LD         3     -     -     -       3
                                        SD         2     -     -     -       2
                                        Total      9     1     1     -      11

C. Suggestions for improvement          IF         1     2     2     1       6
   of equipment and facilities          AF         2     -     -     1       3
                                        LD         -     -     -     -       -
                                        SD         -     -     -     -       -
                                        Total      3     2     2     2       9

D. Decrease repetitiousness             IF         -     1     2     -       3
   of practice material                 AF         -     -     -     -       -
                                        LD         -     1     1     1       3
                                        SD         -     2     -     -       2
                                        Total      -     4     3     1       8

E. Omit or change classroom             IF         -     2     -     -       2
   work or (in Ex6) include             AF         -     1     -     -       1
   classroom or perform functions       LD         -     -     1     -       1
   of classroom                         SD         -     2     2     -       4
                                        Total      -     5     3     -       8

F. Make instructions less               IF         -     -     -     -       -
   repetitious, less "childlike"        AF         -     2     1     -       3
   or let students start                LD         1     1     -     -       2
   machines                             SD         -     2     -     -       2
                                        Total      1     5     1     -       7

G. Miscellaneous                        IF         1     1     1     2       5
                                        AF         1     1     1     1       4
                                        LD         2     2     2     3       9
                                        SD         2     4     2     3      11
                                        Total      6     8     6     9      29



                                                 Ex4   Ex5   Ex6   Ex7   Total

Item 6: What like?

A. Efficiency, good organization,       IF         1     4     5     1      11
   good planning, no time wasted        AF         1     2     3     1       7
                                        LD         -     3     3     -       6
                                        SD         3     3     3     -       9
                                        Total      5    12    14     2      33

B. 8 minute rest pause                  IF         3     -     -     1       4
                                        AF         2     1     -     1       4
                                        LD         2     -     1     2       5
                                        SD         4     -     -     1       5
                                        Total     11     1     1     5      18

C. Friendliness, cheerfulness,          IF         1     1     1     3       6
   helpfulness of staff                 AF         2     1     1     -       4
                                        LD         2     -     -     1       3
                                        SD         2     -     1     1       4
                                        Total      7     2     3     5      17

D. Classroom work                       IF         2     -     -     3       5
                                        AF         2     -     -     -       2
                                        LD         3     -     -     -       3
                                        SD         2     1     -     1       4
                                        Total      9     1     -     4      14

E. Laboratory sessions or               IF         3     -     -     -       3
   practice sessions or tests           AF         -     -     -     2       2
   or working with machines             LD         1     -     -     2       3
                                        SD         2     1     -     1       4
                                        Total      6     1     -     5      12

F. Blank or "nothing in                 IF         -     1     1     -       2
   particular"                          AF         -     1     -     1       2
                                        LD         -     4     2     -       6
                                        SD         -     2     -     1       3
                                        Total      -     8     3     2      13

G. Everything                           IF         -     -     -     -       -
                                        AF         1     -     1     1       3
                                        LD         -     -     -     -       -
                                        SD         2     1     1     -       4
                                        Total      3     1     2     1       7

H. Playback or hearing own              IF         1     -     -     -       1
   voice                                AF         -     -     -     -       -
                                        LD         2     -     -     2       4
                                        SD         -     -     2     -       2
                                        Total      3     -     2     2       7

I. Was interesting or "made             IF
   interesting"                         AF         1     -     1     1       3
                                        LD         -     -     -     -       -
                                        SD         2     1     1     -       4
                                        Total      1     1     1     3       6

K. Miscellaneous                        IF         1     1     -     3       5
                                        AF         2     2     3     2       9
                                        LD         1     -     1     2       4
                                        SD         4     -     4     1       9
                                        Total      8     3     8     8      27















APPENDIX IV
SUPPLEMENTARY TABLES
Contents

TABLE A
TABLE B
TABLE C
TABLE D
TABLE E
TABLE F
TABLE G
TABLE H
TABLE I
TABLE J
TABLE K
TABLE L
TABLE M








TABLE A

RELIABILITY COEFFICIENTS DERIVED FROM ANALYSIS OF THE
APTITUDE TEST FOR ALL THREE REPLICATION EXPERIMENTS

(Underlined split-half coefficients are corrections by the
Spearman-Brown prophecy formula. Underlined coefficients for
correlations between raters are corrections for attenuation
by the Spearman formula. In each pair of rows below, the
second row, marked "corrected", holds the underlined
coefficients.)

                                                   Ex4    Ex5    Ex7
                                           N =      28     27     28

Split-half coefficients, individual raters:

  Rater C, Phonemic                                .85    .54    .77
                                     (corrected)   .92    .70    .86

  Rater S, Phonemic                                .78    .73    .65
                                     (corrected)   .88    .84    .79

  Rater C, Overall                                 .90    .88    .91
                                     (corrected)   .95    .94    .95

  Rater B, Overall                                 .87    .78    .80
                                     (corrected)   .93    .88    .89

Split-half coefficients, both raters'
scores combined:

  Phonemic                                         .86    .70    .77
                                     (corrected)   .94    .83    .86

  Overall                                          .92    .85    .92
                                     (corrected)   .96    .92    .96

  Combined Phonemic & Overall scores               .93    .84    .89
                                     (corrected)   .96    .91    .94

Correlations between raters:

  Rater C x Rater S, Phonemic                      .89    .81    .82
                                     (corrected)   .99   1.05   1.00

  Rater C x Rater B, Overall                       .93    .87    .88
                                     (corrected)   .99    .96    .96
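
The two corrections named in the note to Table A can be illustrated with a short sketch. It assumes the standard textbook forms of both formulas (the report names them but does not write them out), and the function names are illustrative, not the authors'. The inputs are the Experiment 4 phonemic entries of Table A.

```python
import math


def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Estimate full-test reliability from a split-half correlation.

    length_factor is how many times longer the full test is than each half
    (2.0 for an odd/even split); this is the usual prophecy formula.
    """
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


def correct_for_attenuation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Divide an observed correlation by the geometric mean of two reliabilities."""
    return r_xy / math.sqrt(rel_x * rel_y)


# Rater C, Phonemic, Experiment 4: split-half r = .85
stepped_up = spearman_brown(0.85)
print(f"{stepped_up:.2f}")

# Rater C x Rater S, Phonemic, Experiment 4: observed r = .89,
# divided by the two raters' corrected reliabilities (.92 and .88)
disattenuated = correct_for_attenuation(0.89, 0.92, 0.88)
print(f"{disattenuated:.2f}")
```

Under these assumptions the sketch gives .92 and .99, matching the corresponding table entries. Because the divisor is itself an estimate, a corrected coefficient can exceed 1.00, as the Experiment 5 phonemic entry in the table does.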



TABLE B

MEAN SCORES OF RATERS ON APTITUDE, FINAL TRAINED AND FINAL
UNTRAINED, WITH DIFFERENCES BETWEEN RATERS AND SELF-DIFFERENCES
BETWEEN MEANS OF ODD AND EVEN ITEMS ON THE APTITUDE TEST.

Special abbreviations: C, S, and B: Raters C, S, and B.
D: differences between raters or self-differences.

MEANS AND DIFFERENCES BETWEEN RATERS

             Experiment 4         Experiment 5         Experiment 7
            Ap    FT    FU       Ap    FT    FU       Ap    FT    FU
Ph    C    690   827   823      704   814   807      588   768   732
      S    634   744   704      611   705   665      567   684   678
      D     56    83   119       93   109   142       21    84    54

OA    B    792   994   883      941  1062  1028      921  1085  1018
      C    615   856   754      679   831   780      563   807   713
      D    177   138   129      262   231   248      353   278   305

MEANS ON ODD AND EVEN ITEMS WITH SELF-DIFFERENCES

               Ex4           Ex5           Ex7
             C     S       C     S       C     S
ODD         691   640     706   628     604   559

ODD         620   802     683   933     584   921




TABLE C

CORRELATIONS BETWEEN PHONEMIC AND OVERALL VARIABLES

(Underlined coefficients are corrections by the Spearman-Brown
prophecy formula to compare criterion test intercorrelations
with Aptitude Test correlations. The underlined coefficients
are shown here in parentheses.)

        N     Ap       Pr           Po           FT           FU
Ex4    28    .85    .83 (.91)    .90 (.95)    .91 (.95)    .86 (.94)
Ex5    27    .73    .71 (.83)    .57 (.73)    .48 (.65)    .64 (.78)
Ex7    28    .64    .63 (.77)    .77 (.86)    .72 (.84)    .55 (.71)





TABLE D

INTERCORRELATIONS BETWEEN TESTS IN THE REPLICATION EXPERIMENT
FOR BOTH OVERALL AND PHONEMIC VARIABLES WITH MEANS OF
CORRELATIONS.

                 Experiment 4           Experiment 5           Experiment 7
                Pr   Po   FT   FU      Pr   Po   FT   FU      Pr   Po   FT   FU

Ph-Ph    Ap    .77  .80  .78  .85     .38  .34  .49  .61     .72  .70  .74  .75
         Pr         .85  .86  .87          .77  .53  .48          .67  .77  .71
         Po              .88  .89               .56  .42               .87  .60
         FT                   .85                    .80                    .74

OA-OA    Ap    .84  .80  .76  .89     .64  .43  .53  .81    [ ? ] .53  .66  .85
         Pr         .90  .83  .88          .83  .75  .66          .79  .83  .84
         Po              .89  .85               .87  .63               .78  .68
         FT                   .85                    .67                    .81

Ph-OA    Ap    .80  .85  .84  .87     .50  .37  .43  .64     .66  .62  .73  .71
         Pr         .86  .84  .77          .57  .32  .22          .73  .69  .74
         Po              .90  .81               .36  .17               .61  .53
         FT                   .79                    .51                    .66

OA-Ph    Ap    .66  .67  .74  .79     .16  .24  .44  .46     .58  .41  .57  .43
         Pr         .80  .77  .86          .58  .62  .63          .55  .67  .60
         Po              .86  .90               .67  .68               .72  .62
         FT                   .87                    .63                    .68

([ ? ] marks one coefficient that is illegible in the source.)

Mean Correlation Coefficients, All Three Experiments
(computed through z-transformations)

Special abbreviation: Tr: training criterion tests (Pr, Po, FT).

           PhTr        OATr        CoTr        PhFU        OAFU        CoFU
          N    M      N    M      N    M      N    M      N    M      N    M
PhAp      9   .66     9   .67    18   .67     3   .75     3   .76     6   .76
OAAp      9   .52     9   .68    18   .61     3   .59     3   .77     6   .69
CoAp     18   .60    18   .68    36   .64     6   .68     6   .77    12   .73
PhTr      9   .78     9   .70    18   .74     9   .74     9   .62    18   .68
OATr      9   .71     9   .84    18   .78     9   .75     9   .78    18   .76
CoTr     18   .74    18   .78    36   .76    18   .74    18   .71    36   .73
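
Averaging correlations "through z-transformations" can be sketched as follows. The sketch assumes an unweighted mean of Fisher z values (the report does not say whether the values were weighted by sample size), and the function name is illustrative. The input is the nine Ph-Ph coefficients between the Aptitude Test and the three training criterion tests (Pr, Po, FT) in Experiments 4, 5, and 7 from Table D.

```python
import math
from typing import Sequence


def mean_correlation(rs: Sequence[float]) -> float:
    """Average correlations via Fisher's z: z = atanh(r), then back with tanh."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Ph-Ph block, Ap row, columns Pr, Po, FT for Experiments 4, 5, and 7
ph_ap_with_ph_training = [0.77, 0.80, 0.78, 0.38, 0.34, 0.49, 0.72, 0.70, 0.74]
print(f"{mean_correlation(ph_ap_with_ph_training):.2f}")
```

Under that assumption the result rounds to .66, the tabled PhAp by PhTr mean for N = 9. The transformation matters because a plain arithmetic mean of r values is biased when the coefficients differ widely, as they do between Experiment 5 and the other two.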



TABLE E

MEANS OF TREATMENTS IN THE REPLICATION EXPERIMENTS FOR THE
PHONEMIC AND OVERALL VARIABLES WITH DIFFERENCES BETWEEN
APTITUDE AND CRITERION SCORES AND WITH MEANS OF EACH EXPERIMENT.

Special abbreviations: DPr, DPo, DFT, DFU: differences between
Aptitude Test and criterion test scores.

                       N    Ap    Pr    Po    FT    FU   DPr   DPo   DFT   DFU

PHONEMIC
  Ex 4         IF      7   625   768   800   779   757   143   175   154   132
               AF      7   701   843   880   854   809   142   179   153   108
               LD      7   674   776   801   781   765   102   127   107    91
               SD      7   649   704   763   729   723    55   114    80    74
               Mm          662   773   811   786   763   111   149   124   101

  Ex 5         IF      7   662   717   767   723   737    55   105    61    75
               AF      7   653   793   842   774   747   140   189   121    94
               LD      7   665   776   824   762   743   111   159    97    78
               SD      6   648   746   832   781   713    98   184   133    65
               Mm          657   758   816   760   735   101   159   103    78

  Ex 7         IF      7   557   664   733   705   688   107   176   148   131
               AF      7   538   692   731   700   677   154   193   162   139
               LD      7   590   715   772   768   721   125   182   178   131
               SD      7   625   703   712   730   733    78    87   105   108
               Mm          578   694   737   726   705   116   159   148   127

OVERALL
  Ex 4         IF      7   691   815   887   902   815   124   196   211   124
               AF      7   701   863   954   974   838   162   253   273   137
               LD      7   732   826   893   939   824    94   161   207    92
               SD      7   688   809   852   885   795   121   164   197   107
               Mm          703   828   897   925   818   125   194   222   115

  Ex 5         IF      7   829   878   938   926   931    49   109    97   102
               AF      7   777   911   988   954   892   134   211   177   115
               LD      7   798   898   976   949   898   100   178   151   100
               SD      6   838   919   954   958   895    81   116   120    57
               Mm          810   901   964   947   904    91   154   137    94

  Ex 7         IF      7   752   872   936   951   871   120   184   199   119
               AF      7   699   835   915   905   820   136   216   206   121
               LD      7   745   884   960   958   890   139   215   213   145
               SD      7   781   880   929   969   881    99   148   188   100
               Mm          744   868   935   946   865   124   191   202   121
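
The difference columns of Table E are each criterion mean minus the Aptitude mean for the same group. A minimal sketch of that arithmetic, using the phonemic means of the IF treatment in Experiment 4, is given below; the function name and dictionary layout are illustrative assumptions, not part of the report.

```python
from typing import Dict


def gains_over_aptitude(aptitude: int, criterion: Dict[str, int]) -> Dict[str, int]:
    """Return each criterion mean minus the aptitude mean, keyed as D<test>."""
    return {f"D{test}": score - aptitude for test, score in criterion.items()}


# Phonemic means, Experiment 4, IF treatment: Ap = 625
if_gains = gains_over_aptitude(625, {"Pr": 768, "Po": 800, "FT": 779, "FU": 757})
print(if_gains)
```

The resulting values (143, 175, 154, 132) agree with the DPr, DPo, DFT, and DFU entries for that row. The same check applied to other rows is a quick way to catch digits that were misread in transcription.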



TABLE F

MEAN GAINS ON THE PRE-TESTS BY DAYS ABOVE SCORES ON THE
SAME ITEMS ON THE APTITUDE TEST.

               N    Day 1   Day 2   Day 3   Day 4   Day 5   Day 6

Ph    Ex4     28     106     206      80      43     125      80
      Ex5     27     182      82      97      41      99      45
      Ex7     28     120     136     129     136      54      98
      Mm             136     142     102      73      92      75

OA    Ex4     28      33     131     120     149      94     176
      Ex5     27     121      80     115      40      68     131
      Ex7     28     105     117     138     188      52      92
      Mm              86     110     124     127      71     133

Co    Mm             111     126     113     100      82     104






TABLE G

MEAN PHONEMIC, OVERALL, AND COMBINED SCORES BY TREATMENTS FOR THE FIRST
THREE AND LAST THREE DAYS ON THE PRE-TESTS WITH DIFFERENCES BETWEEN THEM.

Special Abbreviations: 1st: mean of first three Pre-test days. 2nd: mean
of last three Pre-test days. Dif: difference between 1st and 2nd.

            Phonemic                  Overall                   Combined
        IF    AF    LD    SD      IF    AF    LD    SD      IF    AF    LD    SD

1st    709   789   750   717     854   865   867   860     782   827   809   789
2nd    723   763   761   715     856   877   871   873     789   820   816   794
Dif    -14   +26   -11    +2      -2   -12    -4   -13      -8    +7    -7    -5
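The Dif row is defined above as the difference between the 1st and 2nd means. The sketch below checks that reading against the Phonemic columns, assuming the sign convention is 1st minus 2nd (which is what the printed signs suggest). The treatment labels follow the IF, AF, LD, SD abbreviations used in the report's abstract, and the function name `differences` is hypothetical.

```python
def differences(first, second):
    """Return first-period mean minus second-period mean for each treatment."""
    return {treatment: first[treatment] - second[treatment] for treatment in first}


# Phonemic means for the first three and last three pre-test days.
phonemic_first = {"IF": 709, "AF": 789, "LD": 750, "SD": 717}
phonemic_second = {"IF": 723, "AF": 763, "LD": 761, "SD": 715}

print(differences(phonemic_first, phonemic_second))
# {'IF': -14, 'AF': 26, 'LD': -11, 'SD': 2}
```

A negative value therefore means the pre-test mean rose from the first three days to the last three.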







TABLE H

ADJUSTED MEANS OF TREATMENTS DERIVED FROM ONE-WAY ANALYSIS
OF COVARIANCE OF EACH CRITERION TEST IN EACH EXPERIMENT
FOR PHONEMIC AND OVERALL VARIABLES IN THE REPLICATION
EXPERIMENTS.

                            Phonemic                Overall
                        Ex4    Ex5    Ex7       Ex4    Ex5    Ex7

Pre-test          IF    798    715    683       825    866    865
                  AF    802    795    729       864    930    870
                  LD    760    773    703       802    904    885
                  SD    711    750    658       821    903    852

Post-test         IF    841    765    755       898    928    933
                  AF    838    844    773       956   1005    936
                  LD    788    821    759       868    981    960
                  SD    778    836    663       866    940    913

Final Trained     IF    815    720    725       912    915    947
                  AF    818    777    738       975    970    923
                  LD    770    758    756       917    956    958
                  SD    741    786    685       897    945    954

Final Untrained   IF    793    735    702       826    918    865
                  AF    771    750    704       839    914    855
                  LD    754    738    713       800    905    890
                  SD    737    720    702       808    876    853






TABLE I

TWO-WAY ANALYSES OF COVARIANCE OF IF, AF, AND LD TREATMENTS
BY EXPERIMENTS 4, 5, AND 7 FOR PHONEMIC AND OVERALL
SCORES ON ALL FOUR CRITERION TESTS.

PHONEMIC

Test               Source          df     MS       F     Sig.

Pre-tests          Treatments       2    729    3.59     P<.05
                   Experiments      2    319    1.57     N.S.
                   Interaction      4    205    1.01     N.S.
                   Within cells    53    203

Post-tests         Treatments       2    465    1.93     N.S.
                   Experiments      2    129     .54     N.S.
                   Interaction      4    303    1.26     N.S.
                   Within cells    53    241

Final Trained      Treatments       2    215     .91     N.S.
                   Experiments      2    874    3.70     P<.05
                   Interaction      4    267    1.13     N.S.
                   Within cells    53    236

Final Untrained    Treatments       2     46     .21     N.S.
                   Experiments      2    387    1.80     N.S.
                   Interaction      4     50     .23     N.S.
                   Within cells    53    215

OVERALL

Test               Source          df     MS       F     Sig.

Pre-tests          Treatments       2    404    2.48     N.S.
                   Experiments      2     90     .55     N.S.
                   Interaction      4    240    1.47     N.S.
                   Within cells    53    163

Post-tests         Treatments       2    761    4.43     P<.05
                   Experiments      2     52     .31     N.S.
                   Interaction      4    347    2.04     N.S.
                   Within cells    53    170

Final Trained      Treatments       2    411    2.46     N.S.
                   Experiments      2    644    3.86     P<.05
                   Interaction      4    256    1.53     N.S.
                   Within cells    53    167

Final Untrained    Treatments       2     31     .25     N.S.
                   Experiments      2     82     .66     N.S.
                   Interaction      4    140    1.12     N.S.
                   Within cells    53    125
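Each F value in Table I is consistent with the usual ratio of an effect's mean square to the within-cells mean square. The sketch below illustrates this for the Phonemic Pre-tests block only; it is an arithmetic check, not a reconstruction of the authors' covariance computations, and the function name `f_ratio` is hypothetical.

```python
def f_ratio(ms_effect, ms_within):
    """Return the F ratio (effect mean square over within-cells mean square),
    rounded to two decimals as printed in the table."""
    return round(ms_effect / ms_within, 2)


# Phonemic Pre-tests block of Table I.
ms_within = 203
mean_squares = {"Treatments": 729, "Experiments": 319, "Interaction": 205}

for source, ms in mean_squares.items():
    print(source, f_ratio(ms, ms_within))
# Treatments 3.59
# Experiments 1.57
# Interaction 1.01
```

Judging significance additionally requires the critical F for the stated degrees of freedom (2 or 4 and 53 in Table I), which this sketch does not compute.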










TABLE J

ADJUSTED MEANS OF TREATMENTS AND EXPERIMENTS DERIVED FROM
TWO-WAY ANALYSIS OF COVARIANCE FOR PHONEMIC AND OVERALL
VARIABLES WITH MEANS OF MEANS AND DEVIATIONS FROM MEANS OF
MEANS.

                             Pr            Po            FT            FU
                           M    Dm       M    Dm       M    Dm       M    Dm

PHONEMIC
  Treatments     IF       725  -22      780  -15      748  -13      740   +2
                 AF       773  +26      817  +22      774  +13      743   +5
                 LD       743   -4      787   -8      760   -1      731   -7
                 Mm       747           795           761           738

  Experiments    Ex4      768  +21      793   -1      775  +14      745   +7
                 Ex5      737  -10      783  -11      728  -33      716  -22
                 Ex7      737  -10      807  +13      779  +18      753  +15
                 Mm       747           794           761           738

OVERALL
  Treatments     IF       848  -17      915  -24      920  -20      863   -2
                 AF       885  +20      965  +26      957  +17      870   +5
                 LD       861   -4      936   -3      943   +3      861   -4
                 Mm       865           939           940           865

  Experiments    Ex4      862   -4      934   -4      962  +22      861   -3
                 Ex5      857   -7      935   -3      911  -29      858   -6
                 Ex7      874  +10      946   +8      947   +7      874  +10
                 Mm       864                         940           864
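The Dm columns of Table J can be read as each adjusted mean minus the mean of means (Mm) for that test. The sketch below works this through for the Phonemic Pr treatment means (725, 773, 743), assuming Mm is the simple unweighted average rounded to a whole number. The labels follow the abstract's IF, AF, LD abbreviations, and the function name `deviations_from_mean_of_means` is hypothetical.

```python
def deviations_from_mean_of_means(means):
    """Return (Mm, deviations), where Mm is the rounded unweighted average
    of the given means and each deviation is mean minus Mm."""
    mm = round(sum(means.values()) / len(means))
    deviations = {label: value - mm for label, value in means.items()}
    return mm, deviations


# Phonemic Pre-test adjusted treatment means from Table J.
phonemic_pr = {"IF": 725, "AF": 773, "LD": 743}
mm, dm = deviations_from_mean_of_means(phonemic_pr)
print(mm)  # 747
print(dm)  # {'IF': -22, 'AF': 26, 'LD': -4}
```

These values match the printed Mm of 747 and the deviations -22, +26, and -4. Other blocks of the table may differ by a unit where the original figures were rounded from unrounded means.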





TABLE K

TWO-WAY MULTIPLE REGRESSION ANALYSES OF COVARIANCE OF IF,
AF, AND LD TREATMENTS BY EXPERIMENTS 4, 5, AND 7 FOR ALL
FOUR CRITERION TESTS USING COMBINED PHONEMIC AND OVERALL
SCORES.

Test               Source          df       MS        F     Sig.

Pre-tests          Treatments       2    1697.3    2.96     N.S.
                   Experiments      2    1236.0    2.15     N.S.
                   Interaction      4     257.3     .45     N.S.
                   Within cells    52     574.6

Post-tests         Treatments       2    1939.8    3.18     P<.05
                   Experiments      2     281.2     .46     N.S.
                   Interaction      4    1291.6    2.12     N.S.
                   Within cells    52

Final Trained      Treatments       2    1117.6    2.10     N.S.
                   Experiments      2    3468.3    6.52     P<.01
                   Interaction      4     805.6    1.52     N.S.
                   Within cells    52     531.6

Final Untrained    Treatments       2     253.7     .59     N.S.
                   Experiments      2    1003.8    2.33     N.S.
                   Interaction      4     378.1     .88     N.S.
                   Within cells    52     431.6






TABLE L

TWO-WAY ANALYSES OF COVARIANCE OF EXPERIMENTS 4, 5, AND 7
BY IF AND AF TREATMENTS AND ALSO BY AF AND LD TREATMENTS
FOR THE POST-TESTS USING COMBINED PHONEMIC AND OVERALL SCORES.

                   Source          df       MS        F     Sig.

AF and IF          Treatments       2    3454.4    4.64     P<.05
                   Experiments      2     372.0     .50     N.S.
                   Interaction      4     895.4    1.20     N.S.
                   Within cells    34     744.3

AF and LD          Treatments       2    1648.1    2.23     N.S.
                   Experiments      2     392.4     .53     N.S.
                   Interaction      4     587.1     .79     N.S.
                   Within cells    34     739.5










TABLE M

MEANS OF PHONEMIC AND OVERALL SCORES BY TREATMENTS FOR ALL
TESTS IN EXPERIMENT 5, THE PART OF EXPERIMENT 6 TRAINED IN
EXPERIMENT 5, AND THE PART OF EXPERIMENT 6 NOT TRAINED IN
EXPERIMENT 5 WITH MEANS OF TREATMENT MEANS.

Note: In Experiment 6 "In" signifies Introductory Test and "F" Final Test.

                  Experiment 5                Experiment 6            Experiment 6
                                              Trained in 5            Untrained in 5
             Ap    Pr    Po    FT    FU      In    Pr    Po     F      In    Pr    Po     F

Ph   IF     662   717   767   723   737     701   710   760   728     720   734   712   715
     AF     653   793   842   774   747     733   770   805   754     744   746   773   793
     LD     665   776   824   762   743     770   769   782   780     763   754   735   759
     SD     648   746   832   781   713     735   769   782   756     704   720   732   754
     Mm     657   758   816   760   735     735   754   782   754     734   739   738   755

OA   IF     829   878   938   926   931     899   929   947   947     896   905   949   943
     AF     777   911   988   954   892     945   946   960   969     863   902   961   928
     LD     798   898   976   949   898     940   948   945   950     887   901   936   938
     SD     838   919   954   958   895     943   955   939   974     879   884   949   953
     Mm     810   901   964   947   904     932   944   948   960     881   898   949   940




